MEAN-FIELD SPIN GLASS MODELS FROM THE CAVITY-ROST PERSPECTIVE

MICHAEL AIZENMAN, ROBERT SIMS, AND SHANNON L. STARR 



Abstract. The Sherrington-Kirkpatrick spin glass model has been studied as a source 
of insight into the statistical mechanics of systems with highly diversified collections of 
competing low energy states. The goal of this summary is to present some of the ideas 
which have emerged in the mathematical study of its free energy. In particular, we high- 
light the perspective of the cavity dynamics, and the related variational principle. These 
are expressed in terms of Random Overlap Structures (ROSt), which are used to describe 
the possible states of the reservoir in the cavity step. The Parisi solution is presented as 
reflecting the ansatz that it suffices to restrict the variation to hierarchal structures which 
are discussed here in some detail. While the Parisi solution was proven to be correct, 
through recent works of F. Guerra and M. Talagrand, the reasons for the effectiveness of 
the Parisi ansatz still remain to be elucidated. We question whether this could be related to 
the quasi-stationarity of the special subclass of ROSts given by Ruelle's hierarchal 'random 
probability cascades' (also known as GREM). 
0. An outline

The Sherrington-Kirkpatrick spin glass model has been studied as a source of insight 
into statistical mechanics of systems with highly diversified collections of patterns for the 
minimization of the free energy, or energy. The model is based on a Hamiltonian which 
incorporates interactions with high levels of frustration and disorder. The goal of this
article is to present some of the ideas which have emerged in the study of the SK model, 
and in particular highlight an approach for the analysis of its free energy influenced by the 
cavity perspective. 

The discussion is organized as follows. 

In Section 1 we present the Sherrington-Kirkpatrick model [22], and comment on some
of its basic features and puzzles. A more general version of the model is presented in
Appendix C. Among the essential features exhibited by these models is the presence of
a rich diversity of low energy configurations. A proposal for a solution of the SK model
was developed in a series of works, driven by the astounding insight of G. Parisi [17]. An
essential feature of the proposed solution is the ansatz that at low temperatures the model's
Gibbs states exhibit a hierarchal structure. The Parisi approach was further clarified by
Mézard, Parisi and Virasoro [15], and through Derrida's REM and GREM
calculations [8], which have in turn motivated Ruelle's construction [20].

In Section 2 we present the cavity perspective and show that it naturally leads to the 
random overlap structure as the order parameter. An order parameter is a quantity which
captures an essential feature of the system, whose determination provides key information 
on the system's state. In the ferromagnetic Ising model the role is usually played by the 
magnetization. Parisi has presented his solution of the SK spin glass model as involving 
an order parameter which is a monotone function of the unit interval. That, however, 
presupposes 'ultrametricity' or the hierarchal structure discussed below. Without such an 
assumption, we argue that the natural order parameter is a ROSt. 

Section 3 presents interpolation techniques. The considerable recent progress in the
mathematical study of the SK model was stimulated, and indeed enabled, by the
interpolation argument which was introduced in the work of F. Guerra and F. L. Toninelli [11]. The
basic tools are presented here within the context of ROSt.

In Section 4 we discuss a variational formulation of the solution [3], which starts with
an extension to general ROSts of the remarkable statement of F. Guerra [10] that Parisi's
ansatz provides a rigorous bound. The extended variational principle is shown to provide 
the correct answer, but is not computationally effective. 

In part 5 we present a hierarchal "random probability cascade" (RPC) model, following 
closely a construction which was formulated by D. Ruelle. For ROSt within this class, 
the variational quantity can be presented as a functional defined over monotone functions 
(equivalently, probability measures) over the unit interval. The Parisi solution can be ex- 
plained as based on the ansatz that it suffices to restrict the variation to ROSts in that class. 

The Parisi expression for the free energy was proved by M. Talagrand to be correct [25].
This was established through the criterion which is provided by Guerra's interpolation 
bound. However, the notable result still does not fully address the challenge of explaining 
the reasons for the validity of the Parisi ansatz. In part 6, we comment on that, and on 
the question whether the reasons for the validity of Parisi's ansatz could be related to the 
remarkable quasi-stationarity of the hierarchal RPC under the dynamical process which is 
naturally associated with the cavity picture [21].
1. The Sherrington-Kirkpatrick spin glass model: Basics

1.1. Formulation of the model. Spin glass models were formulated in an attempt to
provide analyzable and instructive examples of systems with intricate dynamics and
equilibrium states of rich structure. A prime example is the Sherrington-Kirkpatrick model [22],
whose configurations are described by $N$ spin variables $\{\sigma_i\}_{i=1,\dots,N}$ taking values $\pm 1$,
and interacting via the random Hamiltonian:

$$-H_N(\sigma;\omega,h) = \frac{1}{\sqrt{N}} \sum_{1\le i<j\le N} J_{ij}(\omega)\,\sigma_i\sigma_j + h\sum_{i=1}^{N}\sigma_i , \tag{1.1}$$

where $J_{ij}(\omega)$ are i.i.d. Gaussian random variables with the standard normal distribution, and $h$ is a real
parameter.
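As a concrete illustration, the following Python sketch samples the couplings and evaluates the energy of one configuration. It assumes the convention $-H_N = N^{-1/2}\sum_{i<j} J_{ij}\sigma_i\sigma_j + h\sum_i \sigma_i$, with the sum running over pairs $i<j$; the names `sample_couplings` and `sk_energy` are illustrative and do not come from the text.

```python
import math
import random


def sample_couplings(n, rng):
    """Upper-triangular table J[i][j] (i < j) of i.i.d. standard normal couplings."""
    return [[rng.gauss(0.0, 1.0) if j > i else 0.0 for j in range(n)]
            for i in range(n)]


def sk_energy(sigma, J, h=0.0):
    """H_N(sigma), assuming -H_N = N^{-1/2} sum_{i<j} J_ij s_i s_j + h sum_i s_i."""
    n = len(sigma)
    pair_sum = sum(J[i][j] * sigma[i] * sigma[j]
                   for i in range(n) for j in range(i + 1, n))
    return -(pair_sum / math.sqrt(n) + h * sum(sigma))


rng = random.Random(0)
N = 12
J = sample_couplings(N, rng)
sigma = [rng.choice((-1, 1)) for _ in range(N)]
flipped = [-s for s in sigma]  # global spin flip
```

At $h = 0$ the energy is invariant under the global flip $\sigma \to -\sigma$, so `sk_energy(sigma, J)` and `sk_energy(flipped, J)` coincide, while a field $h \neq 0$ shifts the energy by $-h\sum_i \sigma_i$.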

Since its introduction, the model has attracted considerable discussion and shown itself 
to contain various surprises. Even before one addresses the complex structure which the 
model aims to express, one encounters certain basic questions for which the answer is not 
immediate. 

1.2. Comments on the ground state energy. The normalizing factor $N^{-1/2}$, included in
eq. (1.1), ensures that the lowest energy

$$\mathcal{E}_N(\omega,h) = \min_{\sigma} H_N(\sigma;\omega,h)$$

is typically of order $N$, even when $h = 0$. However, the fact that this is so requires an
argument, since for any a priori chosen configuration the typical order of magnitude of the
energy is only $O(N^{1/2})$ (the distribution of the collection of energies is symmetric under
reflection, due to the invariance of the distribution of $\{J\}$ under $J_{\cdot\cdot} \to -J_{\cdot\cdot}$). The
resolution of this issue is easy. However, some of the next questions are not so simple:

[Q 1.] Is the random variable $\mathcal{E}_N(\omega,h)/N$ sharply distributed, for large $N$, and does it
converge in distribution to a constant as $N \to \infty$?

The answer to both questions is "yes" (see [24] and references therein), though for the
second question it took considerable time for the answer to be established rigorously. The
task was accomplished in a clever and simple argument of Guerra and Toninelli [11].

[Q 2.] Compute, or give an effective way to estimate, the distributional limit (which
does exist),

$$E_0 = \lim_{N\to\infty} \mathcal{E}_N(\omega,h)/N .$$

[Q 3.] Produce an algorithm for determining the energy minimizing configuration, or at
least find one for which the resulting energy per volume $H_N(\sigma;\omega)/N$ is close to $E_0$.

It turns out that in order to determine $E_0$, either theoretically or numerically, it is
essential to consider the equilibrium states of the model at positive temperatures, which are, of
course, of intrinsic interest. Thus, one is led to consider:
• the partition function,

$$Z_N(\beta;\omega,h) = \sum_{\sigma} e^{-\beta H_N(\sigma;\omega,h)} ,$$

• the quenched free energy, which is $(-\beta^{-1})$ times the following quantity

$$Q_N(\beta,h) = \mathbb{E}\left(\log Z_N(\beta;\omega,h)\right) ,$$

where $\mathbb{E}(\cdot)$ represents the average over the random 'environment' $\omega$.

A derivative of the free energy yields the mean value of the energy density in the
quenched state:

$$E_N(\beta,h) := \frac{1}{N}\,\mathbb{E}\left(\mathcal{E}_N(\beta;\omega,h)\right) = -\frac{\partial}{\partial\beta}\, Q_N(\beta,h)/N , \tag{1.2}$$

where $\mathcal{E}_N(\beta;\omega,h)$ is the Gibbs state average energy

$$\mathcal{E}_N(\beta;\omega,h) = \sum_{\sigma} H_N(\sigma;\omega,h)\, \frac{e^{-\beta H_N(\sigma;\omega,h)}}{Z_N(\beta;\omega,h)} . \tag{1.3}$$

Standard convexity arguments imply that the mean is, at almost every inverse temperature
$\beta$, also the typical value of the energy density. More can be said on the so-called
self-averaging property of $\mathcal{E}_N(\beta;\omega,h)$ through the 'concentration of measure' principle, which
is nicely presented in the book of Talagrand [24].
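For very small systems the relation between the derivative of $\ln Z_N$ and the Gibbs average of the energy can be checked by exhaustive enumeration. The sketch below does this for a single sample of the couplings (no disorder average), assuming the convention $-H_N = N^{-1/2}\sum_{i<j} J_{ij}\sigma_i\sigma_j + h\sum_i\sigma_i$; the function names are illustrative.

```python
import itertools
import math
import random


def energy(sigma, J, h):
    """H_N(sigma), assuming -H_N = N^{-1/2} sum_{i<j} J_ij s_i s_j + h sum_i s_i."""
    n = len(sigma)
    pair_sum = sum(J[i][j] * sigma[i] * sigma[j]
                   for i in range(n) for j in range(i + 1, n))
    return -(pair_sum / math.sqrt(n) + h * sum(sigma))


def log_partition(beta, J, h=0.0):
    """ln Z_N(beta) by exhaustive enumeration of the 2^N configurations."""
    n = len(J)
    return math.log(sum(math.exp(-beta * energy(s, J, h))
                        for s in itertools.product((-1, 1), repeat=n)))


def gibbs_energy(beta, J, h=0.0):
    """Gibbs average of H_N: sum_s H(s) exp(-beta H(s)) / Z."""
    n = len(J)
    num = den = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        e = energy(s, J, h)
        w = math.exp(-beta * e)
        num += e * w
        den += w
    return num / den


rng = random.Random(1)
N = 8
J = [[rng.gauss(0.0, 1.0) if j > i else 0.0 for j in range(N)] for i in range(N)]
beta, step = 0.7, 1e-5
# Central finite difference of ln Z in beta, to be compared with minus the Gibbs energy.
derivative = (log_partition(beta + step, J) - log_partition(beta - step, J)) / (2 * step)
```

The enumeration costs $2^N$ evaluations, so it is only meant as a sanity check of the sign and normalization conventions, not as a practical method.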

1.3. Diversity. Before we turn to the more detailed discussion of the free energy, let us
comment on the question concerning explicit algorithms for finding low energy
configurations. Two natural algorithms, which are discussed in greater detail in [1] for $h = 0$, are:

i. The greedy algorithm: $\sigma_i$ is determined successively, with respect to some order of
the indices, by optimizing at each step the sign of the contribution of the new terms. For
instance,

$$\sigma_1 = +1 , \quad \text{and for } i = 2, \dots, N: \quad \sigma_i = \mathrm{sign}\Big\{ \sum_{j<i} J_{ij}\,\sigma_j \Big\} . \tag{1.4}$$

ii. The eigenstate-shadowing algorithm:

$$\sigma_j = \mathrm{sign}\, \psi_j(\omega) \tag{1.5}$$

with $\psi(\omega)$ one of the lowest eigenstates of the Hermitian matrix $J_{ij}(\omega)$, which is sampled
with the GOE distribution.

A point to be appreciated here is that while the typical spectrum of the corresponding
quadratic form is known, through Wigner's celebrated semi-circle law, the non-linear problem
of determining the minimum restricted to the vertices of the hypercube in $\mathbb{R}^N$ is, at present,
much harder.

The greedy algorithm, which typically yields configurations with $H_N(\sigma;\omega,0)/N \approx
-0.5319$, can be easily improved upon, while an improvement over the second one, which
typically yields $H_N(\sigma;\omega,0)/N \approx -0.6366$ [1], presents a harder challenge.
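The greedy algorithm is simple enough to try numerically. The following sketch is one possible implementation, assuming the convention $-H_N = N^{-1/2}\sum_{i<j} J_{ij}\sigma_i\sigma_j$ at $h = 0$, so that each new spin is aligned with the field produced by the spins already chosen; the function names and the system size are illustrative choices.

```python
import math
import random


def greedy_configuration(J):
    """sigma_1 = +1; each later spin is aligned with the field of the earlier spins."""
    sigma = [1]
    for i in range(1, len(J)):
        field = sum(J[j][i] * sigma[j] for j in range(i))
        sigma.append(1 if field >= 0 else -1)
    return sigma


def energy_per_spin(sigma, J):
    """H_N(sigma)/N at h = 0, assuming -H_N = N^{-1/2} sum_{i<j} J_ij s_i s_j."""
    n = len(sigma)
    pair_sum = sum(J[i][j] * sigma[i] * sigma[j]
                   for i in range(n) for j in range(i + 1, n))
    return -pair_sum / (n * math.sqrt(n))


rng = random.Random(2)
N = 300
# Upper-triangular couplings J[i][j], i < j, i.i.d. standard normal.
J = [[rng.gauss(0.0, 1.0) if j > i else 0.0 for j in range(N)] for i in range(N)]
sigma = greedy_configuration(J)
e_greedy = energy_per_spin(sigma, J)
```

For moderate $N$ the value of `e_greedy` fluctuates from sample to sample around the quoted figure, with deviations that shrink as $N$ grows.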

It may be noted that both algorithms allow for the construction of many very different 
configurations with comparable energy. This does not yet prove that such a diversity per- 
sists at the bottom of the spectrum, since neither yields the ground state energy per spin, 
but nevertheless the diversity seen here does offer a hint of the diversity characteristic of 
the model. Those observations lead one to the fascinating question: 

[Q 4.] How much variety is there among the low energy configurations? 



By flipping a few spins of the minimizing configuration, one can produce many
configurations with energy in the range $\mathcal{E}_N(\omega,h) + O(1)$. However, the question is whether
one finds configurations with energies close to the ground state which are extensively
distinct from each other. For this purpose, the distance between the configurations may be
expressed through their overlap, which is defined as

$$q_{\sigma,\sigma'} := \frac{1}{N}\sum_{i=1}^{N} \sigma_i\sigma'_i ,$$

with $\mathrm{dist}(\sigma,\sigma') := 1 - q_{\sigma,\sigma'}$. According to the Parisi picture, at the bottom of the
spectrum one should find a diverse collection of "competing" configurations, whose energies
and overlaps resemble a RPC process. The description is slightly complicated by the need
to lump the configurations into equivalence classes, according to their mutual overlaps.
Furthermore, the discussion of the ground state is (so far) accessible only after
understanding the structure of the positive temperature Gibbs equilibrium states.
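A minimal sketch of the overlap and the associated distance, with illustrative function names, makes the distinction between "a few flipped spins" and "extensively distinct" configurations concrete: flipping $k$ of $N$ spins changes the overlap from $1$ to $1 - 2k/N$.

```python
def overlap(sigma, sigma_prime):
    """q = (1/N) sum_i sigma_i sigma'_i for two +-1 configurations of equal length."""
    if len(sigma) != len(sigma_prime):
        raise ValueError("configurations must have the same length")
    return sum(a * b for a, b in zip(sigma, sigma_prime)) / len(sigma)


def distance(sigma, sigma_prime):
    """dist = 1 - q, which vanishes only for identical configurations."""
    return 1.0 - overlap(sigma, sigma_prime)


sigma = [1, -1, 1, 1, -1, 1, -1, -1]
few_flips = [-s if i < 2 else s for i, s in enumerate(sigma)]  # flip 2 of 8 spins
opposite = [-s for s in sigma]
```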

2. The Cavity Perspective 

2.1. The incremental free energy. 

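It is convenient to present the pressure $P_N(\beta,h) := Q_N(\beta,h)/N$ as a sum of
increments, which describe the effect of the gradual increase in the system's size, starting from
$Z_0 \equiv 1$:

$$P_N(\beta,h) := \frac{1}{N}\,\mathbb{E}(\ln Z_N) = \frac{1}{N}\sum_{n=0}^{N-1} \mathbb{E}\left(\ln \frac{Z_{n+1}}{Z_n}\right) . \tag{2.1}$$

The sequence $P_N(\beta,h)$ converges if and only if the sequence of increments is
Cesàro-convergent, in which case:

$$P(\beta,h) := \lim_{N\to\infty} P_N(\beta,h) = (C\text{-})\lim_{N\to\infty} \mathbb{E}\left(\ln \frac{Z_{N+1}}{Z_N}\right) . \tag{2.2}$$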

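For an intuitive description of the incremental term let us describe the configuration of
a large reservoir of $N$ spins by the symbol $\alpha = (\alpha_1, \dots, \alpha_N)$, and let the next spin be
denoted $\sigma = \alpha_{N+1}$. Then

$$\frac{Z_{N+1}}{Z_N} = \frac{\sum_{\alpha,\sigma} e^{-\beta H_{N+1}(\alpha,\sigma)}}{\sum_{\alpha} e^{-\beta H_N(\alpha)}} . \tag{2.3}$$

We would like to cast the ratio as the effect of the addition of the single spin $\sigma$ to a reservoir
whose state is described by $\alpha$, and is governed by the Hamiltonian $H_N(\alpha,\omega)$. First,
however, one needs to deal with a minor inconvenience: as we go from size $N$ to $N + 1$,
the interaction in $\alpha$ diminishes because of the change in the normalizing factor

$$\frac{1}{\sqrt{N}}\, J_{ij} \;\longrightarrow\; \frac{1}{\sqrt{N+1}}\, J_{ij} . \tag{2.4}$$

To address this, we rewrite the interaction $H_N(\alpha)$ in a form which will allow a natural
subtraction:

$$-H_N(\alpha) = \frac{1}{\sqrt{N+1}} \sum_{1\le i<j\le N} J_{ij}\,\alpha_i\alpha_j + \underline{h}\cdot\alpha + \frac{1}{\sqrt{N(N+1)}} \sum_{1\le i<j\le N} \tilde J_{ij}\,\alpha_i\alpha_j , \tag{2.5}$$

where $J_{ij}$ and $\tilde J_{ij}$ are independent standard Gaussians, and $\underline{h}$ is the vector with all components equal
to $h$. (The variances match since $\frac{1}{N+1} + \frac{1}{N(N+1)} = \frac{1}{N}$.) This is to be compared with

$$-H_{N+1}(\alpha,\sigma) = \frac{1}{\sqrt{N+1}} \sum_{1\le i<j\le N} J_{ij}\,\alpha_i\alpha_j + \underline{h}\cdot\alpha + \frac{1}{\sqrt{N+1}} \sum_{i=1}^{N} J_{i,N+1}\,\alpha_i\,\sigma + h\sigma . \tag{2.6}$$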
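For brevity, let us denote the two terms which appear above as independent additions to
$H_N(\alpha)$, as the Gaussian random variables:

$$V_\alpha := \frac{1}{\sqrt{N+1}} \sum_{i=1}^{N} J_{i,N+1}\,\alpha_i , \qquad \kappa_\alpha := \frac{1}{\sqrt{N(N+1)}} \sum_{1\le i<j\le N} \tilde J_{ij}\,\alpha_i\alpha_j . \tag{2.7}$$

Thus, we get

$$\mathbb{E}\left(\ln \frac{Z_{N+1}}{Z_N}\right) = \mathbb{E}\left(\ln \frac{\sum_{\alpha,\sigma} \xi_\alpha\, e^{\beta (V_\alpha + h)\sigma}}{\sum_{\alpha} \xi_\alpha\, e^{\beta \kappa_\alpha}}\right) , \tag{2.8}$$

where $\{\xi_\alpha\}$ are the weights

$$\xi_\alpha = \exp\Big(\beta \Big[\frac{1}{\sqrt{N+1}} \sum_{1\le i<j\le N} J_{ij}\,\alpha_i\alpha_j + \underline{h}\cdot\alpha\Big]\Big) . \tag{2.9}$$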

Equation (2.8) expresses the incremental contribution to the free energy in terms of the
mean free energy of a particle added to a reservoir whose internal state is described by $\alpha$,
corrected by an inverse-fugacity term ($\kappa$). The latter may be thought of as the free energy
of a 'place holder', or a vacancy, for the cavity into which the $(N+1)$st particle is added.

2.2. The cavity dynamics. One may note that the addition of a particle to the reservoir of
$N$ particles has an effect on the state of the reservoir. For $N \gg 1$, the value of the added
spin, $\sigma$, does not affect significantly the field which would exist for the next increment in
$N$, the direct contribution being only of the order $O(1/\sqrt{N})$. Hence, for the next addition
of a particle we may continue to regard the state of the reservoir as given by just the
configuration $\alpha$. However, the weight of the configuration (which is still to be normalized
to yield its probability) undergoes the change:

$$\xi_\alpha \;\longmapsto\; \xi_\alpha \sum_{\sigma = \pm 1} e^{\beta (V_\alpha + h)\sigma} . \tag{2.10}$$

We refer to this transformation of the state of the reservoir (i.e., its probability distribution)
as the cavity dynamics.
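One step of the cavity dynamics can be sketched in a few lines. The code below assumes that the update multiplies each weight by $\sum_{\sigma=\pm1} e^{\beta(V_\alpha+h)\sigma} = 2\cosh(\beta(V_\alpha+h))$, and, as a simplification, builds the cavity fields as $V_\alpha = N^{-1/2}\sum_i g_i\alpha_i$ with fresh standard Gaussians $g_i$, so that $\mathbb{E}(V_\alpha V_{\alpha'}) = q_{\alpha,\alpha'}$. The function name `cavity_step` and the toy reservoir are hypothetical.

```python
import math
import random


def cavity_step(weights, configs, beta, h, rng):
    """Return the unnormalized weights after adding one spin to the reservoir."""
    n = len(configs[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]  # fresh couplings to the new spin
    updated = []
    for xi, alpha in zip(weights, configs):
        # Cavity field felt by the new spin when the reservoir is in state alpha.
        v = sum(gi * ai for gi, ai in zip(g, alpha)) / math.sqrt(n)
        # Sum over the new spin sigma = +-1 of exp(beta (v + h) sigma).
        updated.append(xi * 2.0 * math.cosh(beta * (v + h)))
    return updated


rng = random.Random(3)
N = 10
alpha = [rng.choice((-1, 1)) for _ in range(N)]
configs = [alpha, [-a for a in alpha], [rng.choice((-1, 1)) for _ in range(N)]]
weights = [0.5, 0.3, 0.2]
after_infinite_temp = cavity_step(weights, configs, beta=0.0, h=0.0, rng=rng)
after_step = cavity_step(weights, configs, beta=1.0, h=0.0, rng=rng)
```

At $\beta = 0$ every weight is simply doubled, and at $h = 0$ a configuration and its global flip receive the same factor, since $\cosh$ is even. The relative weights are otherwise reshuffled by the random fields, which is the effect the cavity dynamics is meant to capture.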

2.3. Random Overlap Structures. The state of the reservoir is relevant in so far as it
correlates with the cavity field $V_\alpha$ and fugacity variable $\kappa_\alpha$. In order to keep track of
just the relevant information, it is natural to introduce the following concept of a random
overlap structure (ROSt). The definition is somewhat tentative, as we do not address here
the possibility that a continuum of states will be needed for the reservoir, in the limit
$N \to \infty$. (One may envision an extension of the definition, but that will require addressing
some technical issues.) Instead, we consider the case that states of the reservoir form just
a countable collection, which we order by the weights. Even this simple concept allows one to
formulate variational bounds, and in fact even to capture Parisi's ansatz.

Definition 2.1. A random overlap structure (ROSt) is a probability space $(\Omega, \mu)$ over which
there are defined: i. a monotone nonincreasing sequence $\{\xi_n(\omega)\}$, and ii. an $\mathbb{N} \times \mathbb{N}$ matrix
$\{q_{n,n'}(\omega)\}$ such that for $\mu$ a.e. $\omega$

(1) $\xi_n(\omega) \ge 0$ and $0 < \sum_n \xi_n(\omega) < \infty$;

(2) $q_{n,n'}(\omega)$ corresponds to a real, positive semidefinite form;

(3) $q_{n,n}(\omega) = 1$ for all $n \in \mathbb{N}$ (which implies $|q_{n,n'}| \le 1$).
Here, for clarity of the concept, we label the states of the reservoir not by $\alpha$, as above,
but by $n \in \mathbb{N}$. However, as we shall see below, in the presence of the additional structure
the somewhat vaguer notation will be convenient. We shall not change the symbol for
the weights but rather just tacitly assume that the sequence $\{\xi_n\}$ is ordered, whereas $\{\xi_\alpha\}$
is just a collection of the weights attached to an index which may have some additional
structure, as will be encountered below.

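2.4. The incremental free energy functional. In the above discussion, we presented the
cavity dynamics as the process of adding a single spin. But one can also add directly $M$
spins. To describe the effect of that, one may associate with each state of the ROSt new
independent families of centered Gaussian random variables $\{\eta_{i,\alpha}\}_{i=1,\dots,M}$ and $\{\kappa_\alpha\}$, with the
covariances

$$\mathbb{E}(\eta_{i,\alpha}\,\eta_{j,\alpha'}) = \delta_{i,j}\, q_{\alpha,\alpha'} , \qquad \mathbb{E}(\kappa_\alpha \kappa_{\alpha'}) = \frac{1}{2}\, q_{\alpha,\alpha'}^2 . \tag{2.11}$$

For the added $M$-spin configuration $\{\sigma_i\}_{i=1,\dots,M}$ we define

$$V_{\alpha,\sigma} := \sum_{i=1}^{M} \eta_{i,\alpha}\,\sigma_i . \tag{2.12}$$

Motivated by the above consideration of the incremental free energy in case the ROSt
is just the SK system of $N$ particles, we define the more general ROSt functional:

$$G_M(\beta,h;\mu) := \frac{1}{M}\,\mathbb{E}\left(\ln \frac{\sum_{\alpha,\sigma} \xi_\alpha\, e^{\beta (V_{\alpha,\sigma} + \underline{h}\cdot\sigma)}}{\sum_{\alpha} \xi_\alpha\, e^{\beta \sqrt{M}\,\kappa_\alpha}}\right) . \tag{2.13}$$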

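To ensure that this functional is well defined, let us note:
Lemma 2.2. For any configuration of the ROSt,

$$\ln \frac{\sum_{\alpha} \xi_\alpha\, e^{\beta\sqrt{M}\,\kappa_\alpha}}{\sum_{\alpha} \xi_\alpha} \qquad \text{and} \qquad \ln \frac{\sum_{\alpha,\sigma} \xi_\alpha\, e^{\beta (V_{\alpha,\sigma} + \underline{h}\cdot\sigma)}}{\sum_{\alpha} \xi_\alpha} \tag{2.14}$$

are integrable with respect to the Gaussian measure averages over $\kappa_\alpha$ and $V_{\alpha,\sigma}$ (denoted
below by $\mathbb{E}_{\kappa,V}$).

Proof. To estimate the mean of the absolute value, it is convenient to use the identity
$|X| = X + 2|{-X}|_+$, for $X \in \mathbb{R}$. Applying Jensen's inequality to the average over $\alpha$
we get

$$-\ln \frac{\sum_{\alpha} \xi_\alpha\, e^{\beta\sqrt{M}\,\kappa_\alpha}}{\sum_{\alpha} \xi_\alpha} \;\le\; \beta\sqrt{M}\, \frac{\sum_{\alpha} \xi_\alpha\, |\kappa_\alpha|}{\sum_{\alpha} \xi_\alpha} . \tag{2.15}$$

With another application of the Jensen inequality, this time to the average over the Gaussian
variables, $\mathbb{E}_\kappa(\ln Q) \le \ln \mathbb{E}_\kappa(Q)$, we get

$$\mathbb{E}_\kappa\left( \left| \ln \frac{\sum_{\alpha} \xi_\alpha\, e^{\beta\sqrt{M}\,\kappa_\alpha}}{\sum_{\alpha} \xi_\alpha} \right| \right) \;\le\; \frac{\beta^2 M}{4} + \beta\sqrt{2M} \;<\; \infty . \tag{2.16}$$

Similar bounds apply to the second quantity in (2.14). □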

It should be clear from the above discussion and some elementary estimates, as the one
given below, that in case the ROSt is just the system of $N \gg M$ particles with the Gibbs
equilibrium state corresponding to the SK interaction ($\mu_N$),

$$G_M(\beta,h;\mu_N) = \frac{1}{M}\,\mathbb{E}\left(\log\left[Z_{N+M}(\beta,h)/Z_N(\beta,h)\right]\right) + O(1/M) . \tag{2.17}$$
However, rather surprisingly, it turns out that quite generally the ROSt functional provides
an upper bound:

Theorem 2.3 ([3], a generalization of Guerra's bound [10]). For any ROSt:

$$\frac{1}{M}\,\mathbb{E}(\ln Z_M) \;\le\; G_M(\beta,h;\mu) + o(1) , \tag{2.18}$$

where $o(1)$ vanishes for $M \to \infty$.

Furthermore, one gets the following expression for the difference:

$$G_M(\beta,h;\mu) - \frac{1}{M}\,\mathbb{E}(\ln Z_M) = \frac{\beta^2}{4} \int_0^1 \mathbb{E}^{(2)}_t\left( (q_{\sigma,\sigma'} - q_{\alpha,\alpha'})^2 \right) dt + o(1) . \tag{2.19}$$

Here $\mathbb{E}^{(2)}_t(\cdot)$ is a double replica average which is defined in Section 3, where this
proposition is proved as part of Theorem 4.1.

Remarks: 1. There is an interesting similarity, but also contrast on which we comment next,
between Theorem 2.3 and the Gibbs variational principle. For an arbitrary Hamiltonian
$H(\sigma)$, and the initial probability measure $\rho_0(d\sigma)$, any probability distribution on the spins,
$\rho(d\sigma)$, yields a variational lower bound for the logarithm of the partition function
$Z = \int e^{-\beta H(\sigma)} \rho_0(d\sigma)$:

$$\ln Z \;\ge\; S(\rho|\rho_0) - \beta\,\rho(H) , \tag{2.20}$$

where $S(\rho|\rho_0)$ is the relative entropy of $\rho$ with respect to $\rho_0$, and $\rho(H)$ is the expectation
value of $H$ with respect to $\rho$. The inequality is saturated (for a finite system) if and only if
$\rho$ is the Gibbs equilibrium state $\rho(d\sigma) = \frac{e^{-\beta H(\sigma)}}{Z}\, \rho_0(d\sigma)$.

2. It is thus curious that the ROSt variational principle yields upper bounds on the quenched
free energy, whereas the usual Gibbs variational estimate yields lower bounds. We owe to
Anton Bovier the interesting observation that this change may be related to one of the
puzzles encountered in Parisi's original argument. There, in the replica calculation the
usual roles of minima and maxima are reversed due to the change of sign in $n(n-1)$ when
$n \to 0$.

3. As will be explained below, restricting the variational bound to the hierarchal ROSt,
RPC, one obtains the result of Guerra [10] that Parisi's solution provides an upper bound
on the pressure (lower bound on the free energy).

3. Interpolation arguments 

3.1. A Gaussian differentiation formula. The derivation of the variational principle rests 
on the following differentiation formula. 

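Lemma 3.1. Let $\Gamma$ be a finite index set and $\{X_\gamma\}_{\gamma\in\Gamma}$ be a sequence of centered Gaussian
random variables whose correlations depend on a parameter $t \in (0, 1)$:

$$\mathbb{E}_t(X_\gamma X_{\gamma'}) = C_{\gamma,\gamma'}(t) , \tag{3.1}$$

with $C_{\gamma,\gamma'}(t)$ differentiable in $t$ and uniformly positive ($C_{\gamma,\gamma'}(t) \ge \epsilon \mathbb{1}$) as a quadratic
form.

Then, for any function $\psi : \mathbb{R}^{|\Gamma|} \to \mathbb{R}$ with continuous second partial derivatives that
are polynomially bounded:

$$\frac{d}{dt}\, \mathbb{E}_t\left(\psi(\{X_\gamma\})\right) = \frac{1}{2} \sum_{\gamma,\gamma'\in\Gamma} \frac{dC_{\gamma,\gamma'}(t)}{dt}\; \mathbb{E}_t\left( \frac{\partial^2 \psi}{\partial X_\gamma\, \partial X_{\gamma'}} \right) . \tag{3.2}$$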
For polynomial functions $\psi$ the differentiation formula can be obtained rather directly
from Wick's rule [23], or through the integration by parts formula for Gaussian random
variables. In Appendix A we present a proof based on the Fourier transform
representation. The statement can be further extended to functions whose second derivatives increase
slower than any inverse Gaussian.

In our applications, we will be differentiating functions of a specific form. For this 
reason, we state: 

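Corollary 3.2. Let $\{X_\gamma\}_{\gamma\in\Gamma}$ be a collection of Gaussian random variables as in Lemma 3.1
with

$$\sup_{t}\, \sup_{\gamma,\gamma'\in\Gamma} \left| \frac{dC_{\gamma,\gamma'}(t)}{dt} \right| < \infty , \tag{3.3}$$

and $\{\xi_\gamma\}_{\gamma\in\Gamma}$ a summable sequence of positive numbers. Let

$$\psi(\{X_\gamma\}) := \ln \sum_{\gamma\in\Gamma} \xi_\gamma\, e^{\beta X_\gamma} \tag{3.4}$$

with some $\beta > 0$. Then, for any $0 < t_1 < t_2 < 1$,

$$\mathbb{E}_{t_2}(\psi) - \mathbb{E}_{t_1}(\psi) = \frac{\beta^2}{2} \int_{t_1}^{t_2} \left[ \mathbb{E}^{(1)}_t\left( \frac{dC_{\gamma,\gamma}(t)}{dt} \right) - \mathbb{E}^{(2)}_t\left( \frac{dC_{\gamma,\gamma'}(t)}{dt} \right) \right] dt , \tag{3.5}$$

where $\mathbb{E}^{(n)}_t$ represent the "weighted replica averages", which are defined, for bounded
functions $f : \Gamma^n \to \mathbb{R}$, by

$$\mathbb{E}^{(n)}_t\left( f(\gamma_1, \dots, \gamma_n) \right) := \mathbb{E}_t\Big( \sum_{\gamma_1,\dots,\gamma_n \in \Gamma} f(\gamma_1, \dots, \gamma_n) \prod_{j=1}^{n} \zeta_t(\gamma_j) \Big) \tag{3.6}$$

with

$$\zeta_t(\gamma) = \frac{\xi_\gamma\, e^{\beta X_\gamma}}{\sum_{\gamma'\in\Gamma} \xi_{\gamma'}\, e^{\beta X_{\gamma'}}} . \tag{3.7}$$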

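Proof. For $\Gamma$ a finite set, the statement is a direct application of (3.2). For infinite
$\Gamma = \{\gamma_1, \gamma_2, \dots\}$ let $\Gamma_n := \{\gamma_1, \dots, \gamma_n\}$. Then, as just stated,

$$\frac{d}{dt}\, \mathbb{E}_t\Big( \ln \sum_{\gamma\in\Gamma_n} \xi_\gamma\, e^{\beta X_\gamma} \Big) = \frac{\beta^2}{2} \left[ \mathbb{E}^{(1)}_{n,t}\left( \frac{dC_{\gamma,\gamma}(t)}{dt} \right) - \mathbb{E}^{(2)}_{n,t}\left( \frac{dC_{\gamma,\gamma'}(t)}{dt} \right) \right] , \tag{3.8}$$

where the annealed multi-replica measures $\mathbb{E}^{(1)}_{n,t}$ and $\mathbb{E}^{(2)}_{n,t}$ are with respect to the random
discrete measure generated by the finite sequence

$$\zeta_{n,t}(\gamma) = \frac{\xi_\gamma\, e^{\beta X_\gamma}}{\sum_{\gamma'\in\Gamma_n} \xi_{\gamma'}\, e^{\beta X_{\gamma'}}} \tag{3.9}$$

for $\gamma \in \Gamma_n$. Thus, the statement holds for the finite subsets $\Gamma_n$. As $n \to \infty$ the random
measures determined by $\zeta_{n,t}$ converge to the random measure determined by $\zeta_t$, in the
total variation norm. The claimed (3.5) then follows using the integrated version of (3.8),
(3.3), and the bounded convergence theorem. □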
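The passage from (3.2) to (3.5) rests on the second derivatives of $\psi = \ln\sum_\gamma \xi_\gamma e^{\beta X_\gamma}$, namely $\partial^2\psi/\partial X_\gamma\partial X_{\gamma'} = \beta^2(\delta_{\gamma,\gamma'}\zeta(\gamma) - \zeta(\gamma)\zeta(\gamma'))$, which is what produces the one-replica and two-replica terms. The following sketch checks this identity by finite differences on a small index set; the function names, the step size and the sample values are illustrative assumptions.

```python
import math
import random


def psi(x, xi, beta):
    """psi = ln sum_gamma xi_gamma exp(beta x_gamma)."""
    return math.log(sum(w * math.exp(beta * v) for w, v in zip(xi, x)))


def gibbs_weights(x, xi, beta):
    """zeta(gamma) = xi_gamma exp(beta x_gamma) / sum over gamma'."""
    raw = [w * math.exp(beta * v) for w, v in zip(xi, x)]
    total = sum(raw)
    return [r / total for r in raw]


def hessian_formula(x, xi, beta):
    """beta^2 (delta_{ij} zeta_i - zeta_i zeta_j)."""
    z = gibbs_weights(x, xi, beta)
    n = len(x)
    return [[beta ** 2 * ((z[i] if i == j else 0.0) - z[i] * z[j])
             for j in range(n)] for i in range(n)]


def hessian_numeric(x, xi, beta, eps=1e-3):
    """Central second differences of psi (the diagonal uses an effective step 2*eps)."""
    n = len(x)

    def shifted(i, si, j, sj):
        y = list(x)
        y[i] += si * eps
        y[j] += sj * eps
        return psi(y, xi, beta)

    return [[(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * eps ** 2)
             for j in range(n)] for i in range(n)]


rng = random.Random(6)
x = [rng.gauss(0.0, 1.0) for _ in range(4)]
xi = [0.4, 0.3, 0.2, 0.1]
beta = 1.3
exact = hessian_formula(x, xi, beta)
approx = hessian_numeric(x, xi, beta)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(4) for j in range(4))
```

Each row of the Hessian sums to zero, reflecting the fact that adding a common constant to all $X_\gamma$ shifts $\psi$ linearly.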
Remarks: 1. The subscript $t$ above indicates that these averages depend on the external
parameter $t$ through the weights (3.7).

2. The derivative of $\mathbb{E}_t(\psi)$ separates into two crucial terms. In many applications, the term
involving the single replica average, $\mathbb{E}^{(1)}_t$, will vanish because the variance of $X_\gamma$ (i.e.,
the diagonal term) will remain constant with respect to the interpolation parameter $t$. For
such cases, one sees that if the off diagonal terms $C_{\gamma,\gamma'}(t)$ only decrease with $t$, then the
function $\mathbb{E}_t(\psi)$ increases in $t$. Stated differently: the average goes up when the variables
$X_\gamma$ become less correlated.

3. Lemma 3.1 and Corollary 3.2 are related to Slepian's inequality, cf. [13].

While various interesting conclusions follow from monotonicity alone, it helps to go 
beyond that. Following is a useful bound. 

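Corollary 3.3. Suppose $\{X_\gamma\}$ and $\{Y_\gamma\}$ are two independent sequences of centered
Gaussian random variables. Suppose that $\psi$ is as in Corollary 3.2. Then

$$\left| \mathbb{E}(\psi(\{X_\gamma\})) - \mathbb{E}(\psi(\{Y_\gamma\})) \right| \;\le\; \beta^2 \max_{\gamma,\gamma'} \left| \mathbb{E}(X_\gamma X_{\gamma'}) - \mathbb{E}(Y_\gamma Y_{\gamma'}) \right| . \tag{3.10}$$

Proof. Consider the Gaussian family $\{Z_\gamma\}$ with covariance

$$C_{\gamma,\gamma'}(t) = t\,\mathbb{E}(X_\gamma X_{\gamma'}) + (1 - t)\,\mathbb{E}(Y_\gamma Y_{\gamma'}) . \tag{3.11}$$

By Corollary 3.2 one obtains a formula for the derivative of $\mathbb{E}(\psi(\{Z_\gamma\}))$, which can be
bounded by the right-hand side of (3.10) at every $t \in (0,1)$. □

We note that if the variances of $X_\gamma$ and $Y_\gamma$ are equal for each $\gamma$ then $\beta^2$ can be replaced
by $\beta^2/2$ in (3.10).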

3.2. GT interpolation and sub-additivity of the free energy. The Gaussian
differentiation formula (3.2) permits a quick derivation of the fundamental result of Guerra and
Toninelli [11] proving the existence of the free energy for the SK model.

In order to state their result it is useful to include extra diagonal terms in the
Hamiltonian. These have a vanishingly small effect in the $N \to \infty$ limit, but allow for the simplest
statement of the theorem:

$$-H_N(\sigma) = \frac{1}{\sqrt{2N}} \sum_{i,j=1}^{N} J_{ij}\,\sigma_i\sigma_j + \underline{h}\cdot\sigma , \tag{3.12}$$

where the $J_{ij}$ are i.i.d. $\mathcal{N}(0, 1)$ random variables. This changes the covariance matrix
entries by an amount of order $O(1/N)$. Therefore, by Corollary 3.3 it does not affect $P(\beta, h)$.
Henceforth, all $P_N(\beta, h)$, etc., are defined relative to this Hamiltonian. Alternatively, one
can define a centered Gaussian process $\{K_N(\sigma) ,\ \sigma \in \{+1, -1\}^N\}$ with covariance

$$\mathbb{E}\left( K_N(\sigma) K_N(\sigma') \right) = \frac{N}{2}\, q_{\sigma,\sigma'}^2 , \tag{3.13}$$

and then $-H_N(\sigma) = K_N(\sigma) + \underline{h}\cdot\sigma$.

The first application of the interpolation is the super-additivity of the quenched free
energy

$$Q_N(\beta,h) := \mathbb{E}\left(\ln[Z_N(\beta,h)]\right) . \tag{3.14}$$

Theorem 3.4 (Guerra-Toninelli [11]). For any $N, M \in \mathbb{N}$,

$$Q_N(\beta,h) + Q_M(\beta,h) \;\le\; Q_{N+M}(\beta,h) . \tag{3.15}$$
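Proof. Consider a system of size $N + M$ with spin configurations $\gamma = \{\gamma_i\}_{i=1}^{N+M}$. Write
the configuration as $\gamma = (\alpha, \sigma)$, with $\alpha = \{\alpha_i\}_{i=1}^{N} := \{\gamma_i\}_{i=1}^{N}$ and
$\sigma = \{\sigma_i\}_{i=1}^{M} := \{\gamma_{N+i}\}_{i=1}^{M}$. The Hamiltonian for the system of $N + M$ spins is given by

$$H_{N+M}(\gamma; h) := -K_{N+M}(\gamma) - \underline{h}\cdot\gamma . \tag{3.16}$$

Guerra and Toninelli have noted the utility of considering the one-parameter family of
Hamiltonians which interpolate between $H_{N+M}$ and the sum of two independent SK
Hamiltonians:

$$H_{N+M}(\gamma, h; t) := -K_{N+M}(\gamma; t) - \underline{h}\cdot\gamma , \tag{3.17}$$

with

$$K_{N+M}(\gamma; t) := \sqrt{1-t}\,\left[ K_N(\alpha) + K_M(\sigma) \right] + \sqrt{t}\, K_{N+M}(\gamma) . \tag{3.18}$$

It is to be understood here that the random variables defining the interaction terms $K_N(\alpha)$,
$K_M(\sigma)$, and $K_{N+M}(\gamma)$, as in (3.13), are each chosen independently. The function

$$\psi(t) := \ln \sum_{\gamma} e^{-\beta H_{N+M}(\gamma, h; t)} \tag{3.19}$$

clearly satisfies

$$\mathbb{E}(\psi(0)) = Q_N(\beta,h) + Q_M(\beta,h) \quad \text{and} \quad \mathbb{E}(\psi(1)) = Q_{N+M}(\beta,h) , \tag{3.20}$$

where $\mathbb{E}(\cdot)$ above stands for integration with respect to all the random couplings in $K_N$,
$K_M$, and $K_{N+M}$. The theorem now follows if we can control the sign of $\frac{d}{dt}\mathbb{E}(\psi(t))$.
To apply our differentiation formula, (3.5), we note that

$$\frac{d}{dt}\, C_{\gamma,\gamma'}(t) = \frac{N+M}{2}\, q_{\gamma,\gamma'}^2 - \frac{N}{2}\, q_{\alpha,\alpha'}^2 - \frac{M}{2}\, q_{\sigma,\sigma'}^2 . \tag{3.21}$$

For the diagonal terms, $\gamma = \gamma'$, the above derivative vanishes. Moreover, $q_{\gamma,\gamma'}$ is a convex
combination of $q_{\alpha,\alpha'}$ and $q_{\sigma,\sigma'}$:

$$q_{\gamma,\gamma'} = \frac{N}{N+M}\, q_{\alpha,\alpha'} + \frac{M}{N+M}\, q_{\sigma,\sigma'} . \tag{3.22}$$

Convexity of the function $f(q) = q^2$ allows one to conclude that

$$\frac{d}{dt}\, C_{\gamma,\gamma'}(t) \le 0 , \tag{3.23}$$

and therefore $\mathbb{E}(\psi(t))$ is increasing by (3.5). This completes the proof. □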
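The sign of the covariance derivative can be checked numerically. The sketch below assumes the covariance $\mathbb{E}(K_N(\sigma)K_N(\sigma')) = \frac{N}{2}q^2_{\sigma,\sigma'}$ from (3.13), so that the derivative in $t$ equals $\frac{N+M}{2}q_{\gamma,\gamma'}^2 - \frac{N}{2}q_{\alpha,\alpha'}^2 - \frac{M}{2}q_{\sigma,\sigma'}^2$; by elementary algebra this is $-\frac{NM}{2(N+M)}(q_{\alpha,\alpha'} - q_{\sigma,\sigma'})^2$. The function names are illustrative.

```python
import random


def overlap(a, b):
    """q = (1/n) sum_i a_i b_i for two +-1 configurations of equal length."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def covariance_derivative(alpha, alpha2, sigma, sigma2):
    """d/dt of the interpolating covariance for gamma = (alpha, sigma), gamma' = (alpha2, sigma2)."""
    n, m = len(alpha), len(sigma)
    q_a, q_s = overlap(alpha, alpha2), overlap(sigma, sigma2)
    q_g = (n * q_a + m * q_s) / (n + m)  # overlap of the joined configurations
    return 0.5 * ((n + m) * q_g ** 2 - n * q_a ** 2 - m * q_s ** 2)


rng = random.Random(4)


def random_spins(k):
    return [rng.choice((-1, 1)) for _ in range(k)]


N, M = 7, 5
samples = [covariance_derivative(random_spins(N), random_spins(N),
                                 random_spins(M), random_spins(M))
           for _ in range(1000)]
largest = max(samples)
a, s = random_spins(N), random_spins(M)
diagonal = covariance_derivative(a, a, s, s)  # gamma = gamma'
```

The derivative is never positive, and it vanishes on the diagonal, which are the two facts used in the proof.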

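Theorem 3.4 immediately implies the existence of the thermodynamic limit:
Corollary 3.5. i) For any $\beta$ and $h$,

$$P(\beta,h) := \lim_{N\to\infty} P_N(\beta,h) \tag{3.24}$$

exists. Furthermore, defining $\mathcal{P}_N(\beta,h;\omega) = \frac{1}{N} \ln Z_N(\beta,h;\omega)$,

$$\lim_{N\to\infty} \mathcal{P}_N(\beta,h;\omega) = P(\beta,h) , \tag{3.25}$$

where the limit is in distribution.

ii) The pressure may also be represented as

$$P(\beta,h) = \lim_{M\to\infty}\, \liminf_{N\to\infty}\, \frac{1}{M}\, \mathbb{E}\left( \ln \frac{Z_{N+M}(\beta,h)}{Z_N(\beta,h)} \right) \tag{3.26}$$

for any $\beta$ and $h$.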
It should be noted here that prior to the GT argument it was known that the fluctuations
of $\mathcal{P}_N(\beta,h;\omega)$ are of diminishing size as $N \to \infty$, a fact which can be deduced by either
martingale methods [18] or a concentration of measure argument [24]. The 'monotonicity
of the interpolation' argument [11] adds the last missing step, which is the convergence of
the sequence $P_N(\beta,h)$.

Proof of Corollary 3.5. The results claimed in (3.24) and (3.26) are simple
consequences of (3.15): namely if $\{Q_N\}_{N\in\mathbb{N}}$ is a super-additive sequence, then the following
limit exists and may be calculated incrementally

$$\lim_{N\to\infty} \frac{Q_N}{N} = \lim_{M\to\infty}\, \liminf_{N\to\infty}\, \frac{Q_{N+M} - Q_N}{M} , \tag{3.27}$$

see Lemma B.1 below.

The convergence (3.25) follows from (3.24) since, as mentioned above, the range of
the probability distribution of $\frac{1}{N}\ln Z_N(\beta,h;\omega)$ narrows as $N \to \infty$, a fact proven in
[18, 24]. □



Remarks: 1. While (i) recovers the Guerra and Toninelli result [11], (ii) is an observation
which was useful in the proof of the variational principle [3].

2. The reader is cautioned that the super-additivity of the quenched pressure, and the
particular direction for its monotonicity under the process of 'amalgamation' in which two
blocks are interpolated into a single system, is not a thermodynamic principle akin to the
Gibbs-phenomenon. For the Curie-Weiss model the inequality in (3.15) is reversed (as a
simple calculation will show).

4. The ROSt variational principle 

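For convenience, let us remind ourselves that the functional representing the increase in
the free energy due to the incorporation of $M$ spins $\sigma \in \{+1, -1\}^M$ into a ROSt $\mu$ whose
configurations are described by $(\{\xi_\alpha\}_\alpha, \{q_{\alpha,\alpha'}\})$ is

$$G_M(\beta,h;\mu) = \frac{1}{M}\,\mathbb{E}\left( \ln \frac{\sum_{\alpha,\sigma} \xi_\alpha\, e^{\beta (V_{\alpha,\sigma} + \underline{h}\cdot\sigma)}}{\sum_{\alpha} \xi_\alpha\, e^{\beta\sqrt{M}\,\kappa_\alpha}} \right) , \tag{4.1}$$

with $\{\kappa_\alpha, V_{\alpha,\sigma}\}$ Gaussian random variables of covariance

$$\mathbb{E}(\kappa_\alpha \kappa_{\alpha'}) = \frac{1}{2}\, q_{\alpha,\alpha'}^2 , \tag{4.2}$$

$$\mathbb{E}(V_{\alpha,\sigma} V_{\alpha',\sigma'}) = M\, q_{\sigma,\sigma'}\, q_{\alpha,\alpha'} . \tag{4.3}$$

Theorem 4.1. 1. For any $N$ and any ROSt $\mu$,

$$\frac{1}{N}\,\mathbb{E}(\ln Z_N(\beta,h)) \;\le\; G_N(\beta,h;\mu) , \tag{4.4}$$

with

$$G_N(\beta,h;\mu) - \frac{1}{N}\,\mathbb{E}(\ln Z_N(\beta,h)) = \frac{\beta^2}{4} \int_0^1 \mathbb{E}^{(2)}_t\left( (q_{\sigma,\sigma'} - q_{\alpha,\alpha'})^2 \right) dt . \tag{4.5}$$

2. The pressure is given by

$$P(\beta,h) = \lim_{M\to\infty}\, \inf_{\mu:\,\mathrm{ROSt}} G_M(\beta,h;\mu) , \tag{4.6}$$

where the limit $M \to \infty$ also equals the supremum over $M$.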
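The replica expectation is just as in Corollary 3.2 with respect to the interpolating
Gaussian process with covariance

$$C_{\gamma,\gamma'}(t) = (1-t)\,\frac{N}{2}\left( q_{\sigma,\sigma'}^2 + q_{\alpha,\alpha'}^2 \right) + t\, N\, q_{\sigma,\sigma'}\, q_{\alpha,\alpha'} , \tag{4.7}$$

where $\gamma = (\alpha, \sigma)$.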

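Proof. Part 1.: The argument is a slight modification of the interpolation scheme
described in Theorem 3.4. Here we consider a system composed of a finite block of spins
$\sigma$, whose interactions are determined by the SK model, and a reservoir of configurations
$\alpha$, whose overlaps are governed by a ROSt $\mu$. Again, we interpolate between a decoupled
state of the system and a state in which some interactions are allowed. The interpolating
Hamiltonian is

$$-H_N(\sigma,\alpha;t) := \sqrt{1-t}\,\left( K_N(\sigma) + \sqrt{N}\,\kappa_\alpha \right) + \sqrt{t}\; V_{\alpha,\sigma} + \underline{h}\cdot\sigma , \tag{4.8}$$

where the random couplings in $K_N(\sigma)$, $\kappa_\alpha$, and $V_{\alpha,\sigma}$, defined in (3.13), (4.2), and (4.3)
respectively, are each drawn independently.
The function

$$\psi(t) := \frac{1}{N} \ln \frac{\sum_{\alpha,\sigma} \xi_\alpha\, e^{-\beta H_N(\sigma,\alpha;t)}}{\sum_{\alpha} \xi_\alpha\, e^{\beta\sqrt{N}\,\kappa_\alpha}} \tag{4.9}$$

is easily seen to satisfy

$$\mathbb{E}(\psi(0)) = P_N(\beta,h) \quad \text{and} \quad \mathbb{E}(\psi(1)) = G_N(\beta,h;\mu) , \tag{4.10}$$

where $\mathbb{E}(\cdot)$ stands for integration with respect to all random variables appearing in (4.9).

Our differentiation formula (3.5) applies again. Letting $\gamma$ now denote pairs $\gamma = (\alpha, \sigma)$,
we have $-H_N(\gamma; t) = K(\gamma; t) + \underline{h}\cdot\sigma$ where a direct calculation shows that

$$\frac{d}{dt}\, \mathbb{E}\left( K(\gamma;t)\, K(\gamma';t) \right) = -\frac{N}{2}\left( q_{\sigma,\sigma'} - q_{\alpha,\alpha'} \right)^2 . \tag{4.11}$$

The covariance derivative vanishes for $\gamma = \gamma'$, since $q_{\alpha,\alpha} = q_{\sigma,\sigma} = 1$; as we saw already,
that and the definite sign in (4.11) imply monotonicity. The full statement, (4.5), follows
by (3.5) and the fundamental theorem of calculus.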
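The "direct calculation" can be spelled out as a short worked equation. It is a sketch under the assumption that the interpolating process has the form $K(\gamma;t) = \sqrt{1-t}\,(K_N(\sigma) + \sqrt{N}\kappa_\alpha) + \sqrt{t}\,V_{\alpha,\sigma}$, with independent terms of covariance $\frac{N}{2}q_{\sigma,\sigma'}^2$, $\frac{1}{2}q_{\alpha,\alpha'}^2$ and $N q_{\sigma,\sigma'}q_{\alpha,\alpha'}$ respectively (the case $M = N$ of the covariances quoted for the functional).

```latex
\begin{aligned}
\mathbb{E}\big(K(\gamma;t)\,K(\gamma';t)\big)
  &= (1-t)\Big[\tfrac{N}{2}\,q_{\sigma,\sigma'}^{2} + \tfrac{N}{2}\,q_{\alpha,\alpha'}^{2}\Big]
     + t\,N\,q_{\sigma,\sigma'}\,q_{\alpha,\alpha'} , \\
\frac{d}{dt}\,\mathbb{E}\big(K(\gamma;t)\,K(\gamma';t)\big)
  &= -\tfrac{N}{2}\Big(q_{\sigma,\sigma'}^{2} - 2\,q_{\sigma,\sigma'}\,q_{\alpha,\alpha'} + q_{\alpha,\alpha'}^{2}\Big)
   = -\tfrac{N}{2}\big(q_{\sigma,\sigma'} - q_{\alpha,\alpha'}\big)^{2} \;\le\; 0 .
\end{aligned}
```

With the diagonal term vanishing, the two-replica term of the differentiation formula carries the prefactor $\frac{1}{N}\cdot\frac{\beta^2}{2}\cdot\frac{N}{2} = \frac{\beta^2}{4}$, where the factor $\frac{1}{N}$ comes from the normalization of $\psi$.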

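Part 2.: We now note that there exists a sequence of ROSts $\mu_N^{SK}$ for which

$$G_M(\beta,h;\mu_N^{SK}) = \frac{1}{M}\,\mathbb{E}\left( \ln \frac{Z_{N+M}(\beta,h)}{Z_N(\beta,h)} \right) + O\!\left(\frac{M}{N}\right) . \tag{4.12}$$

To see that, it suffices to consider the example which has motivated the concept, namely
the case when the ROSt, $\mu_N^{SK}$, is provided by another SK system of $N$ particles, with
$N \gg M$.

Adapting (2.8) to an increment by $M$ we get

$$\mathbb{E}\left( \ln \frac{Z_{N+M}}{Z_N} \right) = \mathbb{E}\left( \ln \frac{\sum_{\alpha,\sigma} \xi_\alpha\, e^{\beta (V_{\alpha,\sigma} + \underline{h}\cdot\sigma)}}{\sum_{\alpha} \xi_\alpha\, e^{\beta\sqrt{M}\,\kappa_\alpha}} \right) , \tag{4.13}$$

where $\{\xi_\alpha\}$ are the weights

$$\xi_\alpha = \exp\Big( \beta \Big[ \frac{1}{\sqrt{2(N+M)}} \sum_{i,j=1}^{N} J_{ij}\,\alpha_i\alpha_j + \underline{h}\cdot\alpha \Big] \Big) . \tag{4.14}$$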
The quantities $V_{\alpha,\sigma}$ and $\kappa_\alpha$ are Gaussian variables whose covariance differs from the
corresponding factors in the desired variational quantity by the factor of
$\frac{N}{N+M} = 1 - \frac{M}{N} + O\big( (\frac{M}{N})^2 \big)$. Applying Corollary 3.3 one may determine that

$$\left| G_M(\beta,h;\mu_N^{SK}) - \frac{1}{M}\,\mathbb{E}\left( \ln \frac{Z_{N+M}(\beta,h)}{Z_N(\beta,h)} \right) \right| = O\!\left(\frac{M}{N}\right) , \tag{4.15}$$

from which (4.12) follows.

Combining this result with Corollary 3.5 part (ii) gives part (2) of Theorem 4.1. □

5. Hierarchal Random Probability Cascades (RPC)

In his commentary on the story of Oedipus, Andre Gide brought up the observation 
that there exist universally valid answers, which are applicable to many questions.¹ A
"universal answer", in the form of a hierarchal structure which appears to play a key role 
in various complex systems, has emerged also in the study of spin-glass models. 

In this section, we describe a family of ROSts each of which is endowed with a re- 
markable property: quasi-stationarity under a class of time evolutions which includes the 
cavity dynamics of Section 2.2. An intriguing and relevant question is whether the class
of examples discussed here includes all the ROSts which exhibit a robust version of quasi- 
stationarity. Before explaining the question, or conjecture, let us present the "random 
energy model" and its hierarchal extension. Both were introduced by Ruelle, as the point 
processes capturing the $N \to \infty$ limit of Derrida's finite model calculations, and called
the REM (for random energy model) and the GREM (for a generalized random energy 
model). Seeking a descriptive term we shall refer to these as the hierarchal "random prob- 
ability cascades" (RPC). 

5.1. The Random Energy Model (REM). The basic building block for the hierarchal
probability cascades is the REM, or REM$_x$ to be specific, which is the Poisson point
process on $[0,\infty)$ with density given by $\rho_x(d\xi) = -d\xi^{-x}$. Here $x$ is a parameter ranging
over $(0,1)$, the minus sign is to ensure that the measure is positive, and each configuration,
drawn according to the REM$_x$, is represented by a sequence of non-negative numbers
denoted by $\{\xi_\alpha(\omega)\}$. Denoting the occupation number of a Borel set $A \subset [0,\infty)$ by

$$ N_A(\omega) := \#\{\alpha : \xi_\alpha(\omega) \in A\}, \tag{5.1} $$

what is stated above means that for the REM$_x$:

i. the occupation numbers of disjoint sets form independent random variables,
ii. the distribution of the occupation number is Poissonian:

$$ P\big(N_A(\omega) = k\big) = e^{-\rho_x(A)}\,\frac{\rho_x(A)^k}{k!}, \tag{5.2} $$

with $\rho_x(A)$ the mean value:

$$ \rho_x(A) = \mathbb{E}(N_A(\omega)) = -\int_A d\xi^{-x}. \tag{5.3} $$

The REM$_x$ process also appears in extreme value theory; in some probability references
it is denoted PD$(x,0)$ ([19]).
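A minimal numerical sketch of the definition above, assuming NumPy is available. The function name `sample_rem` and the truncation to the `n_points` largest points are illustrative choices, not part of the text. Under the change of variable $t = \xi^{-x}$ the density $-d\xi^{-x}$ becomes Lebesgue measure, so the points can be generated as arrival times of a unit-rate Poisson process mapped back through $\xi = t^{-1/x}$.

```python
import numpy as np

def sample_rem(x, n_points, rng):
    """Return the n_points largest points of a REM_x configuration, in descending order.

    With t = xi**(-x) the process has unit density on [0, inf), so the n-th
    arrival time t_n is a sum of n iid Exp(1) gaps and xi_n = t_n**(-1/x).
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie in (0, 1)")
    arrival_times = np.cumsum(rng.exponential(1.0, size=n_points))
    return arrival_times ** (-1.0 / x)

rng = np.random.default_rng(0)
x = 0.5
xi = sample_rem(x, 100_000, rng)
n = np.arange(1, xi.size + 1)
# n**(1/x) * xi_n should be close to 1 for large n (law of large numbers for t_n / n).
ratio = n ** (1.0 / x) * xi
```

The truncation only discards the smallest points, so quantities dominated by the largest points (such as the leading normalized weights) are the ones this sketch approximates well.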



¹ In the case of Oedipus, "I/man" points towards the answer to the two questions which Oedipus faced at
turning points in his life, the one posed by the Sphinx and the other on which years later he sought the advice
of Tiresias. A. Gide: "Oedipe" (1931).

By (5.3), for any $\epsilon > 0$:

$$ \mathbb{E}\big(N_{[\epsilon,\infty)}(\omega)\big) = \epsilon^{-x}. \tag{5.4} $$

It readily follows that with probability one it is possible to re-label its points in descending
order, i.e., write $\{\xi_\alpha(\omega)\} = \{\xi_n(\omega)\}_{n=1}^{\infty}$ where

$$ \xi_1(\omega) \ge \xi_2(\omega) \ge \cdots . \tag{5.5} $$

Furthermore, one has: 

Theorem 5.1. Let $0 < x < 1$, then with respect to the point process REM$_x$, almost surely:

i.
$$ n^{1/x}\,\xi_n(\omega) \xrightarrow[n\to\infty]{} 1, \tag{5.6} $$

ii. the following sum converges if and only if $v > x$:
$$ \sum_n \xi_n(\omega)^v < \infty, \tag{5.7} $$

iii. the partition function $Z(\omega) := \sum_n \xi_n(\omega)$ is almost surely finite, with an infinitely
divisible distribution, satisfying the addition law: $Z \stackrel{\mathcal{D}}{=} 2^{-1/x}(Z + Z')$ where $Z'$ is an iid
copy of $Z$,

iv. the $u$-moment of $Z$ is finite if and only if $u < x$:
$$ \mathbb{E}(Z^u) < \infty, \tag{5.8} $$
where $\mathbb{E}$ represents the expectation value over REM$_x$.

Proof. i. On the scale of $t = \xi^{-x}$, REM$_x$ is a Poisson process of fixed density ($= 1$). Let
$N(t;\omega) := N_{[t^{-1/x},\infty)}(\omega)$ and let $t(n;\omega)$ be the inverse function. Then, by the Law of
Large Numbers (or the ergodic theorem), $N(t;\omega)/t \to 1$, almost surely (for $t \to \infty$). This
can be rewritten as $t(n;\omega)/n \to 1$, which implies (5.6). (Estimates on the deviations can
be deduced using the law of the iterated logarithm.)

ii. The a.s. finiteness statement (5.7) can be deduced from (5.6), or alternatively by
splitting from the sum the finite (almost surely) collection of terms with $\xi_n > 1$, and noting
that the main term is then of finite mean.

iii. The divisibility law for the distribution of $Z(\omega)$ is a direct consequence of the
divisibility of the Poisson point process.

iv. A simple device which facilitates the derivation of (5.8) is the bound, for $0 < u < 1$:

$$ Z(\omega)^u \le Z_{[0,1]}(\omega)^u + \sum_{n:\, \xi_n > 1} \xi_n(\omega)^u, \tag{5.9} $$

where $Z_{[0,1]}$ is the contribution due to $\xi_n \in (0,1]$. With the help of the Hölder inequality,
at $p = 1/u$, applied to the first term, (5.8) can be deduced by a direct calculation. □

5.1.1. Quasi-Stationarity of the REM. Among the more compelling attributes of the REM
point processes is their quasi-stationarity under the dynamics which correspond to
increments through independent factors.

The time evolution can be described through a sequence of steps applied to a configuration
$\{\xi_\alpha(\omega)\}$ generated according to a REM$_x$ process. First, the points of the configuration
are labeled in descending order $\{\xi_\alpha\} = \{\xi_n\}_{n=1}^{\infty}$ as described in (5.5). Here we have omitted
the dependence of the sequence on the randomness $\omega$, and we will continue to do so,
where convenient, in the following. Next, a non-negative sequence of iid random variables
$\{\gamma_n\}_{n=1}^{\infty}$ is drawn independently of $\{\xi_n\}$, with probability distribution $g(d\gamma)$. A






new configuration is obtained by multiplying $\xi_n$ by the random weights $\gamma_n$. To retain the
monotonicity which is assumed in our notation, the resulting configuration is relabeled in
descending order, and it therefore takes the form

$$ \tilde\xi_n = \gamma_{\pi(n)}\,\xi_{\pi(n)} \tag{5.10} $$

where $\pi$ is the appropriate permutation. We also denote

$$ \tilde\gamma_n := \gamma_{\pi(n)}. \tag{5.11} $$

Thus, while $\gamma_n$ is the factor by which $\xi_n$ is multiplied "going forward" in time, $\tilde\gamma_n$ is the
factor by which $\tilde\xi_n$ was increased in the last step.

Theorem 5.2. For any $x \in (0,1)$ and a probability distribution $g(d\gamma)$ of finite moment
$\langle\gamma^x\rangle := \int \gamma^x\, g(d\gamma) < \infty$, there is a constant $K$ so that the REM$_x$ distribution is stationary
under the time evolution produced by the random factors $\{\gamma_n\}$, as described above,
corrected by the factor

$$ K = \langle\gamma^x\rangle^{1/x}, \tag{5.12} $$

in the sense that:

$$ \{K^{-1}\tilde\xi_n\} \stackrel{\mathcal{D}}{=} \{\xi_n\}. \tag{5.13} $$

Furthermore, the past increments $\{\tilde\gamma_n(\omega)\}$ form a sequence of iid random variables with
the modified probability distribution

$$ \tilde g(d\gamma) = \frac{\gamma^x\, g(d\gamma)}{\langle\gamma^x\rangle}, \tag{5.14} $$

which are also independent of $\{\tilde\xi_n\}$.

The last statement may appear paradoxical: you start with a sequence of the iid random
variables $\{\gamma_n\}$, reshuffle them a bit, producing the permuted sequence $\{\tilde\gamma_n\}$, and the result
is a sequence of iid variables with a different distribution! This would certainly not be
possible for any finite collection of random variables, but it is apparently possible in the
infinite setting due to the existence of a bottomless reservoir.

Proof of Theorem 5.2. The proof can be obtained through the moment generating functionals, or
alternatively through the observation that the joint distribution of the collection $\{(\xi_n, \gamma_n)\}$
corresponds to the Poisson process in $\mathbb{R}_+ \times \mathbb{R}_+$ with the density $-d\xi^{-x}\, g(d\gamma)$. The collection
of points $\{(\tilde\xi_n, \tilde\gamma_n)\}$ also forms a Poisson process, since its occupation numbers for disjoint
regions of $\mathbb{R}_+ \times \mathbb{R}_+$ are independent, and have the density $-d(\tilde\xi/\tilde\gamma)^{-x}\, g(d\tilde\gamma)$. It helps to
write this density so that it becomes a probability measure in the second variable:

$$ -d(\xi/\gamma)^{-x}\, g(d\gamma) = -d(\xi/K)^{-x} \times \frac{\gamma^x\, g(d\gamma)}{\langle\gamma^x\rangle}. \tag{5.15} $$

The fact that the second factor on the right hand side is a probability measure which does
not depend on $\xi$ allows one to quickly read from the above the statements which are asserted
in the Theorem. □
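The stationarity statement can be probed numerically. The following is a rough Monte Carlo sketch under stated assumptions, not a proof: the factors are taken lognormal, $\gamma = e^{s g}$ with $g$ standard normal (so that $\langle\gamma^x\rangle = e^{x^2 s^2/2}$ and hence $K = e^{x s^2/2}$), and each REM$_x$ configuration is truncated to its `n_points` largest points. The function name and all parameter values are illustrative. If the rescaled evolved process is again REM$_x$, the mean number of its points in $[\epsilon,\infty)$ should be $\epsilon^{-x}$.

```python
import numpy as np

def mean_count_after_step(x, s, eps, n_points, n_reps, rng):
    """Average number of rescaled points >= eps after one evolution step."""
    # REM_x points: xi_n = t_n ** (-1/x), with t_n the arrival times of a unit-rate Poisson process.
    arrivals = np.cumsum(rng.exponential(1.0, size=(n_reps, n_points)), axis=1)
    xi = arrivals ** (-1.0 / x)
    # Independent lognormal factors gamma = exp(s * g) (an assumed, illustrative law).
    gamma = np.exp(s * rng.standard_normal((n_reps, n_points)))
    # For this law <gamma^x> = exp(x^2 s^2 / 2), so K = <gamma^x>^(1/x) = exp(x s^2 / 2).
    k_factor = np.exp(x * s**2 / 2.0)
    evolved = gamma * xi / k_factor
    return float((evolved >= eps).sum(axis=1).mean())

rng = np.random.default_rng(1)
x, eps = 0.5, 0.05
observed = mean_count_after_step(x, s=1.0, eps=eps, n_points=1000, n_reps=2000, rng=rng)
expected = eps ** (-x)  # mean occupation of [eps, inf) under REM_x
```

Sorting is not needed for this check, since the occupation number of $[\epsilon,\infty)$ does not depend on the labeling of the points.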

The above argument is discussed a bit more explicitly in [21]. Theorem 5.2 states that
each of the REM$_x$ processes is invariant under the stochastic evolution up to a deterministic
correction. A general result of Liggett [14] implies that such invariance in fact singles out
this class of processes. A strengthening of this statement was obtained in the work of
Ruzmaikina and Aizenman [21]. For our applications, it suffices to know that the distribution
of the relative weights is stationary, in the sense that:

$$ \Big\{\frac{\tilde\xi_n}{\tilde Z}\Big\} \stackrel{\mathcal{D}}{=} \Big\{\frac{\xi_n}{Z}\Big\}, \tag{5.16} $$

where the partition function $Z$, resp. $\tilde Z$, are as introduced in Theorem 5.1 (iii). In ref. [21]
this property was termed "quasi-stationarity", and it was shown there that, under certain
limitations on the point process and the distribution of the independent weights, just this
property limits the point process to REM$_x$ at some value of the parameter $x \in (0,1)$.

5.2. The Random Probability Cascades (RPC). The REM point processes were used by
D. Ruelle as building blocks for a hierarchal process which captures the results of Derrida's
calculations involving the large $N$ limit of the free energy in the so-called Generalized
Random Energy Models. In line with Parisi's fundamental insight concerning the SK
spin-glass model, the parameter for the construction is a monotone function $x(q)$ taking $[0,1]$
into itself. Convenient examples, and approximations, are provided by piecewise constant
functions. For each $k \in \mathbb{N}$, a piecewise constant right-continuous function $x(q)$ is specified
by a pair of monotone sequences

$$ 0 = x_0 < x_1 < x_2 < \cdots < x_k < x_{k+1} = 1, \qquad 0 = q_0 < q_1 < q_2 < \cdots < q_k < q_{k+1} = 1, \tag{5.17} $$

in particular,

$$ x(q) := \sum_{i=0}^{k} x_i\, 1_{[q_i, q_{i+1})}(q) \tag{5.18} $$

with $x(1) = 1$.

Following is the hierarchal construction parametrized by this data.

i. Start with a REM$_{x_1}$ process whose points are symbolically labeled as $\{\xi^{(1)}_{\alpha_1}\}$. Here the
subscript $\alpha_1$ is intended to represent a label which just identifies the points, not their
respective ordering. (If absolutely desired, $\alpha_1$ could be regarded as taking values in a random
subset of the line.)

ii. Next, for each $\alpha_1$, we generate a REM$_{x_2}$ process whose points are designated $\xi^{(2)}_{\alpha_2;\alpha_1}$.
The processes corresponding to different values of $\alpha_1$ are chosen independently.

iii. The construction is iterated up to $n = k$. At the $n$-th step, independent versions of the
REM$_{x_n}$ process are generated for each of the distinct values of the "address" $(\alpha_1, \alpha_2, \dots, \alpha_{n-1})$,
and the resulting points are designated as $\xi^{(n)}_{\alpha_n;\alpha_1,\dots,\alpha_{n-1}}$.

The construction yields a hierarchal family of addresses of the form

$$ \alpha = (\alpha_1, \dots, \alpha_k). \tag{5.19} $$

With each value of $\alpha$, we associate

$$ \xi_\alpha := \prod_{n=1}^{k} \xi^{(n)}_{\alpha_n;\alpha_1,\dots,\alpha_{n-1}}. \tag{5.20} $$

The result of the above construction is a point process whose configurations consist of
the collection $\xi(\omega) = \{\xi_\alpha\}$, where $\omega$ - which is omitted on the right hand side - represents
all the randomness which enters the above construction. (Specifically, all the above choices
can be represented by functions defined over a probability space whose points are denoted
by $\omega$.)
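The hierarchal construction can be sketched in a few lines, assuming NumPy and a truncation that is not part of the text: only the `n_branch` largest points of each REM are kept, so the infinite cascade is replaced by a finite tree. The function name `sample_rpc` is hypothetical, and the integer labels carry the rank within each REM purely for convenience (the text stresses that labels need not encode ordering).

```python
import numpy as np

def sample_rpc(xs, n_branch, rng):
    """Truncated random probability cascade.

    xs = (x_1, ..., x_k) are the REM parameters of the successive levels.
    Returns a dict mapping each address (alpha_1, ..., alpha_k) to the weight
    xi_alpha, the product of the REM points met along the address.
    """
    weights = {(): 1.0}
    for x in xs:
        next_weights = {}
        for address, w in weights.items():
            # Independent truncated REM_x for every distinct parent address.
            arrivals = np.cumsum(rng.exponential(1.0, size=n_branch))
            points = arrivals ** (-1.0 / x)
            for label, point in enumerate(points):
                next_weights[address + (label,)] = w * point
        weights = next_weights
    return weights

rng = np.random.default_rng(2)
weights = sample_rpc(xs=[0.3, 0.7], n_branch=5, rng=rng)
total = sum(weights.values())
normalized = {a: w / total for a, w in weights.items()}  # relative weights xi_alpha / Z
```

Because of the truncation, `total` is only a partial sum of the partition function; the sketch is meant to display the tree structure rather than to approximate $Z$ accurately.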

The hierarchal addresses, which play a role in the explicit construction, can ipso facto
be replaced by the more generic ROSt notation, for which the information is expressed
through the overlap kernel, which here is defined as:

$$ q_{\alpha,\alpha'} = q_{n(\alpha,\alpha')}, \qquad n(\alpha,\alpha') := \max\{j : j \le k,\ (\alpha_1,\dots,\alpha_j) = (\alpha'_1,\dots,\alpha'_j)\}. \tag{5.21} $$

An overlap kernel corresponds to a hierarchal address if and only if the condition
$q_{\alpha,\alpha'} \ge r$ is transitive for each real $r$. The condition can equivalently be expressed as
"ultrametricity" of the distance function $\mathrm{dist}(\alpha,\alpha') := 1 - q_{\alpha,\alpha'}$.
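The ultrametric property of a hierarchal overlap kernel can be checked directly on a small tree. In this sketch the names `shared_prefix_length`, `overlap` and `is_ultrametric` are hypothetical, and the list `qs` is indexed by the number of shared leading address components, with the last entry equal to 1 for identical addresses; this is one convenient convention and the indexing used in (5.21) may differ.

```python
import itertools

def shared_prefix_length(a, b):
    """Number of leading components on which the two addresses agree."""
    n = 0
    for ai, bi in zip(a, b):
        if ai != bi:
            break
        n += 1
    return n

def overlap(a, b, qs):
    """Overlap of two addresses; qs[n] applies when exactly n leading components agree."""
    return qs[shared_prefix_length(a, b)]

def is_ultrametric(addresses, qs):
    """Check q(a, c) >= min(q(a, b), q(b, c)) for all triples, i.e. ultrametricity of 1 - q."""
    for a, b, c in itertools.product(addresses, repeat=3):
        if overlap(a, c, qs) < min(overlap(a, b, qs), overlap(b, c, qs)):
            return False
    return True

addresses = list(itertools.product(range(3), repeat=2))  # a depth-2 tree with 3 branches per node
qs = [0.0, 0.4, 1.0]                                     # nondecreasing overlap levels
ultrametric = is_ultrametric(addresses, qs)
```

The check relies only on `qs` being nondecreasing, which mirrors the monotonicity of the sequence $q_0 < q_1 < \cdots$ in (5.17).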

Let us note that for $P^{(2)}$ - the probability measure associated to $\mathbb{E}^{(2)}$ - a calculation
yields [20]

$$ P^{(2)}(q_{\alpha,\alpha'} \le q) = x(q). $$

As a direct consequence of Theorem 5.1 and Theorem 5.2 one has:

Theorem 5.3. For $k \ge 2$, and $0 < x_1 < \dots < x_k < 1$, the partition function
$Z = \sum_{\alpha = (\alpha_1,\dots,\alpha_k)} \xi_\alpha$ is almost surely finite and in distribution satisfies:

$$ Z \stackrel{\mathcal{D}}{=} Z_{x_1} \prod_{n=2}^{k} \mathbb{E}\big([Z_{x_n}]^{x_{n-1}}\big)^{1/x_{n-1}}, \tag{5.22} $$

where $Z_{x_n}$ is a random variable having the distribution of a partition function under
REM$_{x_n}$. In particular,

$$ \mathbb{E}(\log Z) < \infty. \tag{5.23} $$
The above construction yields a process whose configurations consist of the pair:

$$ \big(\xi(\omega),\ \{q_{\alpha,\alpha'}(\omega)\}_{\alpha,\alpha'}\big) \tag{5.24} $$

of: i. a point subset of $[0,\infty)$, and ii. an overlap kernel, which conveys the genealogical
information. Our main interest will concern the system of normalized weights, along with
the overlaps, i.e.,

$$ \big(\{\xi_\alpha(\omega)/Z(\omega)\},\ \{q_{\alpha,\alpha'}(\omega)\}_{\alpha,\alpha'}\big). \tag{5.25} $$

We refer to this process as the Random Probability Cascade.

Remark: The last step in the hierarchal construction should correspond to REM$_{x_{k+1}}$ at
$x_{k+1} = 1$, which may be seen as problematic since for $x = 1$ the normalization $Z$ diverges.
Nevertheless, for $x \nearrow 1$, the normalized average is well defined for all the quantities of
interest. For simplicity of the presentation we shall not stress this point here, and approach
the value $x = 1$ only as a limit.

5.3. Quasi-stationarity of RPC. The hierarchal RPC inherits and broadens the remarkable
quasi-stationarity property of the REM processes. In the context of RPC, the dynamics
allow also correlated evolution of the point configuration. The construction of the evolution
is similar to that considered for the REM model, except that the random factors $\gamma_n$ are
now of the form:

$$ \gamma_n = e^{\psi(\eta_n)} \tag{5.26} $$

with $\{\eta_n\}$ a collection of Gaussian random variables of covariance

$$ \mathbb{E}(\eta_n \eta_{n'}) = q_{\alpha(n),\alpha(n')}, \tag{5.27} $$

where $\alpha = \alpha(n)$ is the inverse of the bijection $n = n(\alpha)$.

Unlike the previous case, the dynamics are now correlated. The correlations between
the increments of the "competing" points are determined through the overlap function, but





are not affected by the relative ranking of their position on the line, which changes in the
course of the time evolution.

It is important to note that the covariance condition (5.27) is satisfiable, i.e., the hierarchal
kernel $q_{\alpha(n),\alpha(n')}$ is always positive definite. To see that, it is useful to construct an
auxiliary genealogical tree for which the ultrametric kernel coincides with the value of $q$ at
which the ancestral lines of $\alpha$ and $\alpha'$ split. A Gaussian process with the covariance (5.27)
is obtained by associating with each $\alpha$ the integral of white noise along the branches of
the tree, in a path leading from the root to $\alpha$, with the covariance $\mathbb{E}((d\eta)^2) = dq$.
Furthermore, by restricting the white noise integral to only $q \in [0,t]$, one obtains a family of
Gaussian variables with an extra parameter $t \in [0,1]$, $\eta_n(t)$, with the covariance:

$$ \mathbb{E}(\eta_n(t)\,\eta_{n'}(t)) = \min\{t,\ q_{\alpha(n),\alpha(n')}\}. \tag{5.28} $$

A convenient explicit representation is obtained by presenting the Gaussian variables
$\eta_{n(\alpha)}$ as sums of mutually independent terms, which in the algorithm described above
correspond to the integrals of white noise over distinct segments of the genealogical tree:

$$ \eta_\alpha = \eta_{n(\alpha)} = \sum_{i=1}^{k} \sqrt{q_{i+1} - q_i}\ Z_{i,\alpha}, \tag{5.29} $$

where $Z_{i,\alpha}$ are normal Gaussian variables with the covariance

$$ \mathbb{E}(Z_{i,\alpha} Z_{j,\alpha'}) = \delta_{ij}\, I[q_{\alpha,\alpha'} \ge q_{i+1}]. \tag{5.30} $$

For a simple statement of the quasi-stationarity, a relevant class of functions is defined
by the Lipschitz norm:

$$ \|\psi\|_{\mathrm{Lip}} := \sup_{x \ne y} \frac{|\psi(x) - \psi(y)|}{|x - y|}. \tag{5.31} $$

Theorem 5.4. Under the dynamics described above, for any $\psi$ of bounded Lipschitz norm,
the configuration which results from the above dynamics has the same distribution as the
process obtained by multiplying $\{\xi_\alpha\}$ by a constant, $e^{\psi_0}$:

$$ \{\tilde\xi_\alpha(\omega)\} \stackrel{\mathcal{D}}{=} \{e^{\psi_0}\,\xi_\alpha(\omega)\}, \tag{5.32} $$

with $\psi_0$ described below. In particular, the partition function satisfies

$$ \tilde Z \stackrel{\mathcal{D}}{=} e^{\psi_0} Z, \tag{5.33} $$

and the process is quasi-stationary, in the sense that the distribution of the relative weights
$\{\xi_\alpha(\omega)/Z(\omega)\}$ is stationary, satisfying the appropriate version of eq. (5.16).

Proof. This statement can be obtained by a direct iteration of the quasi-stationarity
property of the REM processes which are used in the construction of the RPC. It is convenient
to define the partial quantities, for any $j = 1, \dots, k$:

j 3 

4'^ n^i:U....,o„-i (5.34) 

n=l i=0 

Conditioning on the collection of variables and and for each ai, ak-i 

let us consider the evolution for the corresponding subtree which corresponds to multipli- 
cation by 






, ak-i- 



(5.36) 



By Theoi'em l5.2l for each subfamily corresponding to a specified ai, . 

St- <;^|" — \ /fc <;Q!j,;Qi,...,afc_i / — ]\lk / SQfc;ai,...,c 

where $\langle\cdot\rangle$ represents integration with respect to the variables $Z_{k,\alpha}$.

The above procedure of conditioning and averaging may be iterated. Starting from
$\psi_k(y) = \psi(y)$, and denoting by $\mathbb{E}_z(\cdot)$ the average over the normal Gaussian random variable
$z$, we define recursively for $j : k \searrow 0$:

1 



In 



E. 



(5.37) 



It is easy to check that under the Lipschitz condition on $\psi$ the iteration step is well
defined, and, furthermore, the Lipschitz norm does not increase under the mapping
$\psi_j(\cdot) \mapsto \psi_{j-1}(\cdot)$. One obtains

V 



In this sequence, the deterministic quantity appearing in (5.32) is



In 



E, 



^g'/'i(v9T^)^ 



(5.38) 



(5.39) 



□ 



The deterministic value of $\psi_0$ can be alternatively characterized through the solution
of a specific partial differential equation. Namely, consider functions of two variables
$f = f(q,y)$ which satisfy, for $q \in [0,1]$,

$$ \frac{\partial f}{\partial q} + \frac{1}{2}\Big[\frac{\partial^2 f}{\partial y^2} + x(q)\Big(\frac{\partial f}{\partial y}\Big)^2\Big] = 0, \tag{5.40} $$

with the $q = 1$ boundary condition:

$$ f(1,y) = \ln[\cosh[\beta(y + h)]]. \tag{5.41} $$

One may note that the function $x(q)$ enters here as a parameter for the partial differential
equation. Going backward in time, the equation is particularly simple to solve over
intervals where $x(q)$ is constant. Using the Cole-Hopf transformation, on which more
is said next, the solution is provided by the iterative procedure which is described in the
above proof. From this perspective, the value of $\psi_0$ corresponds to $\psi_0 = f(0,0;x)$. We
shall now expand on this point.
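For piecewise constant $x(q)$ the backward solution can be sketched numerically. The code below assumes that (5.40) is the Parisi equation $\partial_q f + \frac12[\partial_y^2 f + x(q)(\partial_y f)^2] = 0$, so that on an interval where $x(q) \equiv x > 0$ the function $e^{x f}$ solves a backward heat equation and one step reads $f(q,y) = \frac1x \ln \mathbb{E}_z \exp[x\, f(q', y + z\sqrt{q'-q})]$, with the $x = 0$ case read as the plain Gaussian average. The names `gaussian_step` and `psi_zero`, the use of Gauss-Hermite quadrature, and the convention "`xs[i]` is the value of $x(q)$ on `[qs[i], qs[i+1])`" are choices made for this illustration.

```python
import numpy as np

# Probabilists' Gauss-Hermite rule; normalising the weights gives a standard Gaussian average.
_NODES, _WEIGHTS = np.polynomial.hermite_e.hermegauss(40)
_WEIGHTS = _WEIGHTS / _WEIGHTS.sum()

def gaussian_step(func, x, dq):
    """Return y -> (1/x) ln E_z exp(x func(y + sqrt(dq) z)); for x == 0, y -> E_z func(...)."""
    def stepped(y):
        values = np.array([func(y + np.sqrt(dq) * z) for z in _NODES])
        if x == 0.0:
            return float(_WEIGHTS @ values)
        return float(np.log(_WEIGHTS @ np.exp(x * values)) / x)
    return stepped

def psi_zero(psi, xs, qs):
    """Solve backward from q = 1 to q = 0 and return the value at y = 0.

    xs[i] is the constant value of x(q) on [qs[i], qs[i+1]); qs starts at 0 and ends at 1.
    """
    func = psi
    for i in reversed(range(len(xs))):
        func = gaussian_step(func, xs[i], qs[i + 1] - qs[i])
    return func(0.0)

# Sanity check on a linear boundary function psi(y) = a*y, for which the backward
# solution is a*y + (a^2 / 2) * integral of x(q') over [q, 1].
a = 0.7
xs, qs = [0.0, 0.5, 1.0], [0.0, 0.3, 0.6, 1.0]
value = psi_zero(lambda y: a * y, xs, qs)
expected = 0.5 * a * a * sum(x * (qs[i + 1] - qs[i]) for i, x in enumerate(xs))
```

The nested closures evaluate the boundary function on a full quadrature grid per level, so the cost grows exponentially with the number of steps; for more than a handful of levels a tabulated grid in $y$ would be the more practical design.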

5.4. Quasi-stationarity of RPC in terms of the Parisi equation. An alternative perspective
on Theorem 5.4 is provided by a continuous time version of the quasi-stationarity. As
it turns out, equation (5.40), which plays a key role in the Parisi solution, appears also as a
martingale condition for the cavity dynamics with respect to the RPC hierarchal ROSt.

For a given function $\psi$ consider the two parameter function $f(t,y) = f(t,y;x)$, which
satisfies the boundary conditions:

$$ f(1,y) = \psi(y + h), \tag{5.42} $$

and the partial differential equation (5.40), which is to be solved from $q = 1$ down to
$q = 0$.

Theorem 5.4 admits the following extension, about which we learned from D. Ruelle.
For simplicity it is implicitly assumed here that the function is suitably differentiable and
bounded. Upon closer analysis, it suffices to assume the Lipschitz condition, as in
Theorem 5.4.

Then the probability distribution of the ROSt configuration is independent of $t$. In
particular, it coincides with that of $e^{\psi_0}\,\xi(\omega)$, where the deterministic factor is

$$ \psi_0 = f(0,0). \tag{5.44} $$

The statement can be proved along the lines of the above proof of Theorem 5.4, or in
terms of stochastic PDE and Ito's formula. We refer the reader for further details on the
latter perspective to [3].

Over intervals of constant $x(q)$ the differential equation can be solved through the
convolution of the function $e^{x(q) f(q,y)}$ with suitable Gaussian measures. This is a slight
variation of the well-known Cole-Hopf transform familiar in the context of nonlinear integrable
PDE's.

In the special case of $x(\cdot)$ constant over the entire interval $(0,1)$ the RPC is really a
REM$_x$. In this situation, one readily verifies that the solution of (5.40) derived through
the Cole-Hopf transformation, starting with the boundary conditions (5.42), at $t = 1$, is
exactly what one would obtain using (5.13). For piecewise constant $x(q)$ this argument can
be employed in steps, to again conclude that the PDE formulation matches with the results
of an iteration of Theorem 5.2, i.e., (5.38). Subdividing the intervals into short segments
the statement can also be easily understood from the perspective of Ito's formula, as is
discussed more explicitly in [3].

The formulation of the solution in terms of the differential equation has the advantage
of being well defined even when the piecewise constant $x(q)$ is replaced by a continuous
function. For the existence of the continuum limit it is imperative to restrict the attention
to the ROSt given by the normalized weights, as in (5.25).

6. Relation with the Parisi solution

Let us now return to the spin glass model for whose solution the above plays a key role.

6.1. The Parisi formula. The partial differential equation (5.40) has made its appearance
in the work of Parisi on the SK model, in the context of rather different considerations.
Without reviewing here Parisi's approach, and his hierarchal ansatz for replica symmetry
breaking, let us present the resulting conjecture for the free energy, a.k.a. the 'Parisi
solution'.

Introducing the ansatz of hierarchal pattern of replica symmetry breaking - a concept
for which the reader is referred to [17, 15] - Parisi has introduced the idea that the order
parameter for the SK model is a monotone function, $x : [0,1] \to [0,1]$. Somewhat
analogously to the much simpler case of the Curie-Weiss mean field ferromagnetic model, the
value of the order parameter can be characterized through either self consistency, based
on the analysis of the cavity dynamics (discussed in Chapters 4 and 5 of [15]), or
through a variational principle. That has led Parisi to investigate solutions $f = f(q,y)$ of
the partial differential equation

$$ \frac{\partial f}{\partial q} + \frac{1}{2}\Big[\frac{\partial^2 f}{\partial y^2} + x(q)\Big(\frac{\partial f}{\partial y}\Big)^2\Big] = 0 \tag{6.1} $$

subject to the boundary condition

$$ f(1,y) = \ln[\cosh[\beta(y + h)]]. \tag{6.2} $$


The resulting value of $f(0,0) = f(0,0;x)$ is incorporated in the Parisi functional, which
is defined as:

$$ P[x] := \ln[2] + f(0,0;x) - \frac{\beta^2}{2}\int_0^1 q\,x(q)\,dq. \tag{6.3} $$

The end result is Parisi's proposal that: 

lim i-logE(Z^v)) = infP[x] Gpar^s^ ■ (6.4) 

N^co iV x{-) 

where the infimum is over monotone functions of the unit interval with values in [0,1]. 
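As a worked special case, the functional can be evaluated in closed form on a single-step ("replica symmetric") order parameter. The sketch assumes the Parisi functional has the form $P[x] = \ln 2 + f(0,0;x) - \frac{\beta^2}{2}\int_0^1 q\,x(q)\,dq$ with $f$ solving the Parisi equation and the boundary condition (6.2), and takes $x(q) = 0$ for $q < \bar q$ and $x(q) = 1$ for $q \ge \bar q$. On $[\bar q, 1]$ the step with $x = 1$ gives $f(\bar q, y) = \ln\cosh\beta(y+h) + \beta^2(1-\bar q)/2$, and on $[0,\bar q)$ the step with $x = 0$ is a Gaussian average, so that $P = \ln 2 + \mathbb{E}_z \ln\cosh\beta(\sqrt{\bar q}\,z + h) + \beta^2(1-\bar q)^2/4$. The function name `parisi_rs` and the quadrature order are illustrative.

```python
import numpy as np

def parisi_rs(beta, h, q_bar, n_nodes=80):
    """Parisi functional on the one-step function x(q) = 1[q >= q_bar] (illustrative sketch)."""
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / w.sum()  # weights of a standard Gaussian average
    u = beta * (np.sqrt(q_bar) * z + h)
    log_cosh = np.logaddexp(u, -u) - np.log(2.0)  # numerically stable ln cosh(u)
    return float(np.log(2.0) + w @ log_cosh + beta**2 * (1.0 - q_bar) ** 2 / 4.0)

annealed = parisi_rs(beta=1.0, h=0.0, q_bar=0.0)   # reduces to ln 2 + beta^2 / 4
high_temp_at_zero = parisi_rs(beta=0.5, h=0.0, q_bar=0.0)
high_temp_at_positive = parisi_rs(beta=0.5, h=0.0, q_bar=0.3)
```

For $h = 0$ and $\beta < 1$ the value at $\bar q = 0$ is a local minimum over $\bar q$, which is what the last two evaluations illustrate; restricting to one-step functions only gives an upper bound on the infimum over all monotone $x(\cdot)$.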

The remarkable arguments of Parisi are still beyond mathematical analysis, but its main 
conclusion is now known to be correct. 

In a surprising development, F. Guerra [10] proved:

Lemma 6.1 (Guerra variational principle). 

inf P[x] < Gparisi ■ (6.5) 

a:(-) 

The analysis, which employs an interpolation argument, yields also a criterion for the
saturation of the inequality. The statement was given a different form in our work [2]:
the variational principle was generalized into an infimum of the functional $G(\beta,h;\mu)$ over
ROSts ($\mu$), and it was shown that in that generality the infimum yields the correct value
(Theorem 4.1). Independently of that, M. Talagrand [25] has proven that the Parisi
conjecture is correct. The proof employs the criterion provided by Guerra's analysis, and insights
supported by a heavy dosage of calculus.

We shall now show how Guerra's variational principle is incorporated in the ROSt
bound of Theorem 4.1.



6.2. The free energy of Hierarchal ROSts. For the hierarchal RPC, the calculation of the
ROSt functional $G_M(\beta,h;\mu) = G_M(\mu)$ is greatly facilitated by their quasi-stationarity
property. We shall now demonstrate that the free energy functional corresponding to the
RPC of a given function $x(q)$ is independent of $M$ and coincides with Parisi's functional
$P[x]$ of (6.3).

The ROSt free energy functional, which is defined in (4.1), can be written as

$$ G_M(\mu) = G_M^{(1)}(\mu) - G_M^{(2)}(\mu) \tag{6.6} $$

with



M 

and 

j^. I in 

M 



4^)(M) = ^E(ln 



(6.7) 



(6.8) 



Lemma 6.2. Let $\mu$ be a ROSt having weights generated by an RPC with parameter
$x = (x_1, \dots, x_n)$ and overlap function $q$. Then for any $M \in \mathbb{N}$:

$$ G_M^{(1)}(\mu) = \ln[2] + f(0,0;x), \tag{6.9} $$

and

$$ G_M^{(2)}(\mu) = \frac{\beta^2}{2}\int_0^1 q\,x(q)\,dq. \tag{6.10} $$

In particular, the free energy functional coincides with the Parisi functional at $x$, i.e.,

$$ G_M(\beta,h;\mu) = P[x]. $$

Proof. Summing over the spins $\sigma$, we cast $G_M^{(1)}$ in the form



GiiV) = ln[2] 



M 



-E In 



where 



$\psi(\eta_{j,\alpha}) := \ln[\cosh[\beta(\eta_{j,\alpha} + h)]]$
and $\eta_{j,\alpha}$ are Gaussian variables with the covariance:



(6.11) 



(6.12) 



(6.13) 



(6.14) 



The quasi-stationarity of the ROSt readily implies that the contribution of the independent
factors $e^{\psi(\eta_{j,\alpha})}$ on the right hand side of (6.12) factorizes, and thus $G_M^{(1)}$ is independent of
$M$. Furthermore, by Theorem 5.5 we see that



(6.15) 



This proves (6.9).



We calculate $G_M^{(2)}$ by interpolation: For any $t \in [0,1]$, define the function



Note that 

^(1) 

Using Lemma A.1 we see that




F'{t) 





x{q)qdq. 



(6.16) 



(6.17) 



(6.18) 



(6.19) 



In (6.18) we have used quasi-stationarity of $\mu$ to remove the dependence on $t$. Equation
(6.10) follows through the integration of $F'(t)$. □

6.3. An Open Problem: Explaining the validity of the Parisi ansatz. As was mentioned
above, it is now a theorem, proven by M. Talagrand [25], that Parisi's ansatz indeed yields
the correct solution for the free energy of the SK model. However, it still seems reasonable
to say that an "explanation" of the reasons for the validity of the Parisi ansatz continues
to present an open challenge. Could RPC's be the only 'robustly' quasi-stationary ROSt's,
and could the validity of Parisi's ansatz be explained by that? Can one formulate some
other fundamental reason for the validity of the Parisi calculation? Given the versatility of
the applications of the Parisi approach, it may be of interest to shed more light on any of
these questions.

Appendix A. The Gaussian Differentiation Lemma

Lemma A.1. Let $X_t \in \mathbb{R}^n$, $t \in (0,1)$, be a vector-valued Gaussian process, with
covariance $C_t$ which is continuously differentiable. Suppose that $\psi : \mathbb{R}^n \to \mathbb{R}$ is twice
continuously differentiable and compactly supported. Then

$$ \frac{d}{dt}\,\mathbb{E}[\psi(X_t)] = \frac{1}{2}\,\mathbb{E}\big[\big((\nabla, \dot C_t \nabla)\psi\big)(X_t)\big]. \tag{A.1} $$

Proof. The joint density function for $X_t$ is

$$ \rho_t(x) = \frac{\exp\big(-\frac{1}{2}(x, C_t^{-1} x)\big)}{\sqrt{\det(2\pi C_t)}}. \tag{A.2} $$

In terms of the Fourier transform, $\hat f(k) = \int e^{-2\pi i (k,x)} f(x)\, d^n x$,

$$ \mathbb{E}[\psi(X_t)] = \int \psi(x)\,\rho_t(x)\,dx = \int \hat\psi(k)\,\hat\rho_t(k)\,dk \tag{A.3} $$

(by Plancherel theorem). Since $\hat\rho_t(k) = \exp(-2\pi^2 (k, C_t k))$, a direct calculation shows

$$ \frac{d}{dt}\,\mathbb{E}[\psi(X_t)] = -2\pi^2 \int (k, \dot C_t k)\,\hat\psi(k)\,\hat\rho_t(k)\,dk. \tag{A.4} $$

But, since $(\nabla\psi)^{\wedge}(k) = 2\pi i k\,\hat\psi(k)$, we see that

$$ -2\pi^2 (k, \dot C_t k)\,\hat\psi(k) = \frac{1}{2}\big((\nabla, \dot C_t \nabla)\psi\big)^{\wedge}(k). \tag{A.5} $$

So, by Plancherel's theorem again,

$$ \frac{d}{dt}\,\mathbb{E}[\psi(X_t)] = \frac{1}{2}\int \big((\nabla, \dot C_t \nabla)\psi\big)^{\wedge}(k)\,\hat\rho_t(k)\,dk = \frac{1}{2}\,\mathbb{E}\big[\big((\nabla, \dot C_t \nabla)\psi\big)(X_t)\big]. \tag{A.6} $$

□
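A one-dimensional numerical check of the differentiation formula, written as a sketch under the assumption that the lemma reads $\frac{d}{dt}\mathbb{E}[\psi(X_t)] = \frac12 \mathbb{E}[((\nabla,\dot C_t\nabla)\psi)(X_t)]$, as the proof above indicates. The covariance $C_t = 1 + t^2$ and the test function $\psi(x) = x^4$ are arbitrary illustrative choices ($\psi$ is not compactly supported, so strictly speaking this falls under the extension by a density argument rather than the lemma itself).

```python
import numpy as np

# Gauss-Hermite rule for standard Gaussian averages (exact for polynomials of low degree).
Z_NODES, Z_WEIGHTS = np.polynomial.hermite_e.hermegauss(40)
Z_WEIGHTS = Z_WEIGHTS / Z_WEIGHTS.sum()

def gauss_mean(func, variance):
    """E[func(X)] for a centered Gaussian X with the given variance."""
    return float(Z_WEIGHTS @ func(np.sqrt(variance) * Z_NODES))

def cov(t):
    return 1.0 + t**2

def cov_dot(t):
    return 2.0 * t

def psi(x):
    return x**4

def psi_second(x):
    return 12.0 * x**2

t, step = 0.5, 1e-5
# Left side: central finite difference of t -> E[psi(X_t)].
lhs = (gauss_mean(psi, cov(t + step)) - gauss_mean(psi, cov(t - step))) / (2.0 * step)
# Right side: (1/2) * dC/dt * E[psi''(X_t)].
rhs = 0.5 * cov_dot(t) * gauss_mean(psi_second, cov(t))
```

Here both sides equal $6\,C_t\,\dot C_t = 7.5$ at $t = 0.5$, since $\mathbb{E}[X_t^4] = 3C_t^2$.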



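We need the following extension of this result to a wider class of functions which is
enabled by a density argument.

Corollary A.1. Let $X_t$ be as in Lemma A.1. Suppose $\psi \in C^2(\mathbb{R}^n)$ and
$\psi, \nabla\psi, \nabla^2\psi \in L^1(\mathbb{R}^n, \rho_t)$ for every $t \in (0,1)$. Also suppose that

$$ \Big( t \mapsto \mathbb{E}\big[\,|\psi(X_t)| + \|\nabla\psi(X_t)\| + \|\nabla^2\psi(X_t)\|\,\big] \Big) \in L^1_{\mathrm{loc}}((0,1)). $$

Then $\mathbb{E}[\psi(X_t)]$ is absolutely continuous and (A.1) holds for almost all $t \in (0,1)$.

Proof. Let $\eta : \mathbb{R}^n \to \mathbb{R}$ be any smooth function, with compact support, such that
$0 \le \eta \le 1$ and such that $\eta(0) = 1$. Define

$$ \psi_\epsilon(x) = \eta(\epsilon x)\,\psi(x), $$

for each $\epsilon > 0$. So $\psi_\epsilon$ is twice continuously differentiable, and with compact support. Also,
$\psi_\epsilon \to \psi$ and $\nabla^2\psi_\epsilon \to \nabla^2\psi$, pointwise, as $\epsilon \to 0$. Finally, we know that $|\psi_\epsilon(x)| \le |\psi(x)|$
for all $x \in \mathbb{R}^n$, and

$$ \|\nabla^2\psi_\epsilon(x)\| \le K\big(|\psi(x)| + \|\nabla\psi(x)\| + \|\nabla^2\psi(x)\|\big), \tag{A.7} $$

for some constant $K < \infty$. (The constant depends only on the sup norm of $\|\nabla\eta(x)\|$ and
$\|\nabla^2\eta(x)\|$.)

By Lemma A.1, integrating,

$$ \mathbb{E}[\psi_\epsilon(X_{t_2})] - \mathbb{E}[\psi_\epsilon(X_{t_1})] = \frac{1}{2}\int_{t_1}^{t_2} \mathbb{E}\big[\big((\nabla,\dot C_t\nabla)\psi_\epsilon\big)(X_t)\big]\,dt, $$

for each $t_1, t_2 \in (0,1)$ and all $\epsilon > 0$. By the dominated convergence theorem,

$$ \lim_{\epsilon\to 0} \mathbb{E}[\psi_\epsilon(X_t)] = \mathbb{E}[\psi(X_t)] $$

for every $t \in (0,1)$. In particular, it is true at $t = t_1$ and $t = t_2$. Similarly, by the
dominated convergence theorem

$$ \lim_{\epsilon\to 0} \mathbb{E}\big[\big((\nabla,\dot C_t\nabla)\psi_\epsilon\big)(X_t)\big] = \mathbb{E}\big[\big((\nabla,\dot C_t\nabla)\psi\big)(X_t)\big] $$

for every $t \in [t_1, t_2]$. But, moreover, the upper bound in (A.7), integrated
against $\rho_t$, is a function of $t$ which is locally integrable, by our hypothesis. Therefore, we
can apply the DCT to the $t$-integral, itself, to determine

$$ \lim_{\epsilon\to 0} \int_{t_1}^{t_2} \mathbb{E}\big[\big((\nabla,\dot C_t\nabla)\psi_\epsilon\big)(X_t)\big]\,dt = \int_{t_1}^{t_2} \mathbb{E}\big[\big((\nabla,\dot C_t\nabla)\psi\big)(X_t)\big]\,dt. $$

So

$$ \mathbb{E}[\psi(X_{t_2})] - \mathbb{E}[\psi(X_{t_1})] = \frac{1}{2}\int_{t_1}^{t_2} \mathbb{E}\big[\big((\nabla,\dot C_t\nabla)\psi\big)(X_t)\big]\,dt. $$

Since this is true for every $t_1, t_2 \in (0,1)$, Lebesgue's differentiation theorem implies the
corollary. □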



Appendix B. Limits for super-additive sequences

In the proof of Theorem 4.1 we made use of the following known statement. For
completeness we present its proof.

Lemma B.1. Let $\{Q_N\}_{N\in\mathbb{N}}$ be a super-additive sequence of real numbers, in the sense
that for any $N, M \in \mathbb{N}$,

$$ Q_N + Q_M \le Q_{N+M}. \tag{B.1} $$

Then the following limit exists, with value in $\mathbb{R} \cup \{\infty\}$, and satisfies

$$ \lim_{N\to\infty} \frac{Q_N}{N} = \sup_N \frac{Q_N}{N}. \tag{B.2} $$

Moreover,

$$ \lim_{N\to\infty} \frac{Q_N}{N} = \lim_{M\to\infty} \liminf_{N\to\infty} \frac{Q_{N+M} - Q_N}{M}. \tag{B.3} $$

Proof. Let $M \in \mathbb{N}$. For any integer $N > M$, one may write $N = n \cdot M + k$ with
$1 \le k \le M$, and by super-additivity,

$$ Q_N \ge n\,Q_M + Q_k. \tag{B.4} $$

Thus

$$ \liminf_{N\to\infty} \frac{Q_N}{N} \ge \frac{Q_M}{M}, \tag{B.5} $$

and hence

$$ \limsup_{N\to\infty} \frac{Q_N}{N} \le \sup_M \frac{Q_M}{M} \le \liminf_{N\to\infty} \frac{Q_N}{N}, $$

from which (B.2) follows.

The proof of (B.3) follows by demonstrating two inequalities. An immediate consequence
of (B.1) is

$$ \frac{Q_M}{M} \le \liminf_{N\to\infty} \frac{Q_{N+M} - Q_N}{M}, \tag{B.6} $$

and a lower bound, which is part of the claim (B.3), follows from the fact that the limit in
(B.2) exists.

The matching upper bound may be obtained by noting that for any $n \in \mathbb{N}$

$$ \frac{Q_{nM+N} - Q_N}{nM+N} = \frac{\sum_{j=1}^{n} \big[Q_{jM+N} - Q_{(j-1)M+N}\big]}{nM+N}, \tag{B.7} $$

and for each $j = 1, \dots, n$

$$ Q_{jM+N} - Q_{(j-1)M+N} \ge \inf_{k \ge N}\,[Q_{k+M} - Q_k]. \tag{B.8} $$

Inserting (B.8) into (B.7), taking $n \to \infty$, and then the supremum over $N$, we arrive at

$$ \lim_{n\to\infty} \frac{Q_n}{n} \ge \liminf_{N\to\infty} \frac{Q_{N+M} - Q_N}{M}, \tag{B.9} $$

which completes the proof of (B.3). □
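A small numerical illustration of the lemma, using the example sequence $Q_N = N - \sqrt{N}$, which is an assumption chosen for this sketch (it is super-additive because $\sqrt{N+M} \le \sqrt{N} + \sqrt{M}$). Both $Q_N/N$ and the increments $(Q_{N+M} - Q_N)/M$ approach the supremum of $Q_N/N$, here equal to 1.

```python
import math

def q_seq(n):
    """Example super-additive sequence Q_N = N - sqrt(N) (illustrative choice)."""
    return n - math.sqrt(n)

def is_superadditive(seq, n_max, tol=1e-12):
    """Check Q_N + Q_M <= Q_{N+M} for all 1 <= N, M <= n_max, up to rounding tolerance."""
    return all(
        seq(n) + seq(m) <= seq(n + m) + tol
        for n in range(1, n_max + 1)
        for m in range(1, n_max + 1)
    )

superadditive = is_superadditive(q_seq, 200)
ratios = [q_seq(n) / n for n in (10, 1_000, 100_000)]                      # Q_N / N
increments = [(q_seq(n + 50) - q_seq(n)) / 50 for n in (10, 1_000, 100_000)]  # (Q_{N+M} - Q_N) / M, M = 50
```

In this example the ratios increase towards 1 without reaching it, consistent with the limit being a supremum rather than a maximum.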



Appendix C. General Interactions

In this appendix, we will illustrate that the results provided in the main text for the SK
Hamiltonian have a simple analogue for more general Hamiltonians. As was done in [2],
we will demonstrate that our analysis also holds for models of the type

$$ H_N(\sigma, h) := -K_N(\sigma) - h \cdot \sigma, \tag{C.1} $$

where the interaction term $K_N(\sigma)$ is now taken to be a centered Gaussian process, indexed
by the spins $\sigma$, with the covariance

$$ \mathbb{E}\big(K_N(\sigma) K_N(\sigma')\big) = \frac{N}{2}\, f(q_{\sigma,\sigma'}). \tag{C.2} $$

Here, for convenience, $f$ is written as a function of the spin overlap. We will assume that
$f$ is a positive power series; i.e., $f(q) := \sum_{r\ge 1} |a_r|^2 q^r$ on $[-1,1]$ with the normalization
$\sum_{r\ge 1} |a_r|^2 = 1$. An explicit realization of such an interaction $K_N$ is given in terms of the
multi-spin interaction:

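$$ K_N(\sigma) = \sqrt{\frac{N}{2}}\ \sum_{r=1}^{\infty} \frac{a_r}{N^{r/2}} \sum_{i_1,\dots,i_r=1}^{N} J_{i_1,\dots,i_r}\,\sigma_{i_1}\cdots\sigma_{i_r}, \tag{C.3} $$

where $J := \{J_{i_1,\dots,i_r}\}$ is a family of independent normal Gaussian variables.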

For the results discussed here, we further assume that $f$ is convex on $[-1,1]$. The
importance of such a condition has been recognized in the literature, e.g. in [11] convexity
was used to prove convergence for the free energy density, in the limit $N \to \infty$. Derrida's
$p$-spin models [8] are obtained by the special choices $f(q) = q^p$ for $p \in \mathbb{N}$, and for these
convexity holds if $p \in 2\mathbb{N}$. In particular, setting $p = 2$, one recovers the SK model, except
that in contrast to (1.1) the tensor in (C.3) need not be symmetric. For convenience we also
include here diagonal terms, but these do not affect the results.

In the analysis of the free energy it is convenient to first assume that the second
derivative of $f$ is continuous up to the boundary, and then use continuity arguments for an
extension of the results. We proceed under this additional assumption.

The analogues of Corollary 3.5 and Theorem 4.1 hold for Hamiltonians defined with the
Gaussian interactions given by (C.2).

Theorem C.1. For any $\beta$ and $h$, define $P_N(\beta,h) = \frac{1}{N} Q_N(\beta,h)$ relative to the Hamiltonian
given by (C.1). Then,

$$ P(\beta,h) := \lim_{N\to\infty} P_N(\beta,h) \tag{C.4} $$

exists, and moreover,

$$ \lim_{N\to\infty} \mathcal{P}_N(\beta,h;\omega) = P(\beta,h), \tag{C.5} $$

almost surely.



Proof. With the very same interpolation scheme (3.17) and (3.18), excepting that the
random variables are now defined via (C.2), one derives

d , , N + M , , N , , M , 

^a.r'W = -^—fiqr^r') - y/l^a.^') - -jfiqa^a'), (C.6) 

in place of (3.21). Superadditivity, as before, follows from the convexity of $f$ and (3.22). □



For the Hamiltonian given by (C.1), one may also develop a cavity perspective by
performing the change in free-energy analysis as described in Sections 2 and 3. Using the
definition of the interactions (C.2), the covariance of a system of $N + M$ spins $\gamma = (\alpha, \sigma)$
is given by

$$ \mathbb{E}\big(K_{N+M}(\gamma) K_{N+M}(\gamma')\big) = \frac{N+M}{2}\, f(q_{\gamma,\gamma'}), \tag{C.7} $$

where we have adopted the notation used in Section 2. To first order, the overlap of the
combined system may be expressed in terms of the overlaps within the two blocks as

$$ q_{\gamma,\gamma'} = q_{\alpha,\alpha'} + (q_{\sigma,\sigma'} - q_{\alpha,\alpha'})\,\frac{M}{N} + O\Big(\Big(\frac{M}{N}\Big)^2\Big), \tag{C.8} $$

see equation (3.22). Taylor expansion of the function $f$, again to first order, yields

$$ \frac{N+M}{2}\, f(q_{\gamma,\gamma'}) - \frac{N}{2}\, f(q_{\alpha,\alpha'}) = -\frac{M}{2}\,\phi(q_{\alpha,\alpha'}) + \frac{M}{2}\, q_{\sigma,\sigma'}\, f'(q_{\alpha,\alpha'}) + O\Big(\frac{M^2}{N}\Big), \tag{C.9} $$

where

$$ \phi(q) := q f'(q) - f(q). \tag{C.10} $$
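As a worked instance, assuming the definition $\phi(q) := q f'(q) - f(q)$ indicated by the expansion above, the $p$-spin choice $f(q) = q^p$ gives the following explicit functions; the SK case corresponds to $p = 2$.

```latex
% p-spin interaction, f(q) = q^p:
f(q) = q^{p}
\quad\Longrightarrow\quad
f'(q) = p\,q^{p-1},
\qquad
\phi(q) = q f'(q) - f(q) = (p-1)\,q^{p}.

% SK model, p = 2:
f(q) = q^{2},
\qquad
f'(q) = 2q,
\qquad
\phi(q) = q^{2}.
```

In the SK case the two covariance ingredients therefore reduce to a term proportional to $q_{\sigma,\sigma'}\,q_{\alpha,\alpha'}$ for the cavity field and a term proportional to $q_{\alpha,\alpha'}^2$ for the reservoir correction.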

Now, given a ROSt $\mu$, one may define two sets of independent, centered Gaussian random
variables $\{\kappa_\alpha\}$ and $\{V_{\alpha,\sigma}\}$, which are attuned to the more general Hamiltonian (C.1). As
indicated by (C.9), these random variables are defined by prescribing their covariances as
follows:

= (C.ll) 



where $\phi$ is as defined in (C.10), and

EiVa^aV^,,^,) = y/'(9"."' (C.12) 

(the positivity of the covariance can be concluded from the representation (C.3)).
Correspondingly, a free energy functional, analogous to (4.1), may be defined as



(C.13) 



With these new definitions, one may derive a variational principle analogous to
Theorem 4.1. Moreover, as in Theorem 4.1, ROSts formed by $N$ particle systems with the
Hamiltonian (C.1) may be used to demonstrate that the inequality actually saturates. Through
an adaptation of the methods discussed above one can prove:

Theorem C.2. Let $\beta > 0$ and $h \in \mathbb{R}$.

i) For any $M \in \mathbb{N}$,

$$ P_M(\beta,h) \le \inf_{\mu:\,\mathrm{ROSt}} G_M(\beta,h;\mu). \tag{C.14} $$

ii) The pressure of the system corresponding to (C.1) may be realized through:

$$ P(\beta,h) = \lim_{M\to\infty}\ \inf_{\mu:\,\mathrm{ROSt}} G_M(\beta,h;\mu). \tag{C.15} $$

For further discussion the reader is referred to [2].